[jira] [Comment Edited] (HBASE-12790) Support fairness across parallelized scans

Samarth Jain (JIRA) Tue, 07 Mar 2017 14:08:46 -0800

    [ https://issues.apache.org/jira/browse/HBASE-12790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15900259#comment-15900259 ]


Samarth Jain edited comment on HBASE-12790 at 3/7/17 10:07 PM:
---------------------------------------------------------------

Thanks for the pointer [~anoop.hbase]. I see that there is already an executor, 
RWQueueRpcExecutor, that gets hold of the ScanRequest in its dispatch method. 

{code}
private boolean isScanRequest(final RequestHeader header, final Message param) {
  if (param instanceof ScanRequest) {
    // The first scan request will be executed as a "short read"
    ScanRequest request = (ScanRequest) param;
    return request.hasScannerId();
  }
  return false;
}

@Override
public boolean dispatch(final CallRunner callTask) throws InterruptedException {
  ...
  if (numScanQueues > 0 && isScanRequest(call.getHeader(), call.param)) {
    queueIndex = numWriteQueues + numReadQueues + scanBalancer.getNextQueue();
  }
  ...
{code}
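
To make the index arithmetic in dispatch concrete, here is a minimal, self-contained sketch of a write | read | scan queue layout. It is not the HBase code: the class name, the Kind enum, and the simple round-robin counters are hypothetical stand-ins, and the ordering of write queues before read queues is only inferred from the numWriteQueues + numReadQueues offset in the excerpt above.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Sketch of a write | read | scan queue layout. All names here are hypothetical. */
final class QueueLayoutSketch {
    enum Kind { WRITE, READ, SCAN }

    private final int numWriteQueues;
    private final int numReadQueues;
    private final int numScanQueues;
    // One round-robin counter per group; an assumed stand-in for the real balancers.
    private final AtomicInteger nextWrite = new AtomicInteger();
    private final AtomicInteger nextRead = new AtomicInteger();
    private final AtomicInteger nextScan = new AtomicInteger();

    QueueLayoutSketch(int numWriteQueues, int numReadQueues, int numScanQueues) {
        if (numWriteQueues < 1 || numReadQueues < 1 || numScanQueues < 0) {
            throw new IllegalArgumentException(
                "need >= 1 write queue, >= 1 read queue, >= 0 scan queues");
        }
        this.numWriteQueues = numWriteQueues;
        this.numReadQueues = numReadQueues;
        this.numScanQueues = numScanQueues;
    }

    /** Returns the index of the queue that a call of the given kind lands in. */
    int dispatchIndex(Kind kind) {
        if (kind == Kind.WRITE) {
            return next(nextWrite, numWriteQueues);
        }
        if (kind == Kind.SCAN && numScanQueues > 0) {
            // Scan queues sit after all write and read queues.
            return numWriteQueues + numReadQueues + next(nextScan, numScanQueues);
        }
        // Reads, and scans when no scan queues are configured, share the read queues.
        return numWriteQueues + next(nextRead, numReadQueues);
    }

    private static int next(AtomicInteger counter, int size) {
        return Math.floorMod(counter.getAndIncrement(), size);
    }

    public static void main(String[] args) {
        QueueLayoutSketch layout = new QueueLayoutSketch(2, 3, 2);
        System.out.println(layout.dispatchIndex(Kind.SCAN));  // 5
        System.out.println(layout.dispatchIndex(Kind.SCAN));  // 6
        System.out.println(layout.dispatchIndex(Kind.READ));  // 2
        System.out.println(layout.dispatchIndex(Kind.WRITE)); // 0
    }
}
```

With 2 write, 3 read, and 2 scan queues, scans land in queues 5 and 6, so any per-scan fairness logic would only need to touch that last group.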

So yes, there is a way forward: a scan attribute can serve this purpose 
without having to add an API to Operation. Having said that, the 
isWriteRequest method in the same class shows how gnarly, brittle, and 
inefficient this kind of request inspection can get: it walks every action of 
a MultiRequest just to decide whether the call is a write. 
{code}
private boolean isWriteRequest(final RequestHeader header, final Message param) {
  // TODO: Is there a better way to do this?
  if (param instanceof MultiRequest) {
    MultiRequest multi = (MultiRequest) param;
    for (RegionAction regionAction : multi.getRegionActionList()) {
      for (Action action : regionAction.getActionList()) {
        if (action.hasMutation()) {
          return true;
        }
      }
    }
  }
  ...
{code}

So the ideal would be a generic enough API that lets clients mark read/write 
requests for whatever they want to do with them on the server side. 
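
As a rough illustration of what such a generic marking API could look like, the sketch below lets a client tag a request once so the server can classify it with a single lookup instead of walking the payload. Everything in it is hypothetical (the class names, the attribute key, and the RequestClass enum); it does not correspond to any existing HBase API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical sketch of client-side request tagging; not an HBase API. */
final class RequestTagSketch {
    enum RequestClass { READ, WRITE, SCAN }

    // Assumed attribute key, made up for this example.
    static final String TAG_KEY = "request.class";

    /** Stand-in for a request that carries free-form client attributes. */
    static final class TaggedRequest {
        private final Map<String, String> attributes = new HashMap<>();

        /** Client side: mark the request once, when it is built. */
        TaggedRequest tag(RequestClass requestClass) {
            attributes.put(TAG_KEY, requestClass.name());
            return this;
        }
    }

    /**
     * Server side: a single map lookup. An empty result means the client did
     * not tag the request and the caller should fall back to inspecting it.
     */
    static Optional<RequestClass> classify(TaggedRequest request) {
        String value = request.attributes.get(TAG_KEY);
        return value == null ? Optional.empty() : Optional.of(RequestClass.valueOf(value));
    }

    public static void main(String[] args) {
        TaggedRequest scan = new TaggedRequest().tag(RequestClass.SCAN);
        System.out.println(classify(scan));                // Optional[SCAN]
        System.out.println(classify(new TaggedRequest())); // Optional.empty
    }
}
```

Returning an empty Optional for untagged requests keeps older clients working, since the server can still fall back to payload inspection for them.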





> Support fairness across parallelized scans
> ------------------------------------------
>
>                 Key: HBASE-12790
>                 URL: https://issues.apache.org/jira/browse/HBASE-12790
>             Project: HBase
>          Issue Type: New Feature
>            Reporter: James Taylor
>            Assignee: ramkrishna.s.vasudevan
>              Labels: Phoenix
>         Attachments: AbstractRoundRobinQueue.java, HBASE-12790_1.patch, 
> HBASE-12790_5.patch, HBASE-12790_callwrapper.patch, HBASE-12790.patch, 
> HBASE-12790_trunk_1.patch, PHOENIX_4.5.3-HBase-0.98-2317-SNAPSHOT.zip
>
>
> Some HBase clients parallelize the execution of a scan to reduce latency in 
> getting back results. This can lead to starvation with a loaded cluster and 
> interleaved scans, since the RPC queue will be ordered and processed on a 
> FIFO basis. For example, suppose two clients, A and B, submit largish scans 
> at the same time, and each scan is broken down into 100 scans by the client 
> (broken down into equal depth chunks along the row key). If the 100 scans of 
> client A are queued first, followed immediately by the 100 scans of client B, 
> client B will be starved out of getting any results back until the scans for 
> client A complete.
> One solution to this is to use the attached AbstractRoundRobinQueue instead 
> of the standard FIFO queue. The queue to be used could be (maybe it already 
> is) configurable based on a new config parameter. Using this queue would 
> require the client to have the same identifier for all of the 100 parallel 
> scans that represent a single logical scan from the client's point of view. 
> With this information, the round robin queue would pick off a task from the 
> queue in a round robin fashion (instead of a strictly FIFO manner) to prevent 
> starvation over interleaved parallelized scans.
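
The round-robin idea in the description can be sketched as follows. This is a minimal illustration under stated assumptions, not the attached AbstractRoundRobinQueue: the class name is made up, the group identifier is a plain String, and the methods are simply synchronized rather than tuned for an RPC hot path.

```java
import java.util.ArrayDeque;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch of a queue that serves groups of tasks in round-robin order. */
final class RoundRobinScanQueueSketch<T> {
    // Insertion-ordered map: group id -> pending tasks of that group (FIFO within a group).
    private final LinkedHashMap<String, ArrayDeque<T>> groups = new LinkedHashMap<>();

    /** Enqueues a task under the identifier shared by all chunks of one logical scan. */
    synchronized void offer(String groupId, T task) {
        groups.computeIfAbsent(groupId, k -> new ArrayDeque<>()).addLast(task);
    }

    /** Takes one task from the group at the head, then moves that group to the tail. */
    synchronized T poll() {
        Iterator<Map.Entry<String, ArrayDeque<T>>> it = groups.entrySet().iterator();
        if (!it.hasNext()) {
            return null; // nothing queued
        }
        Map.Entry<String, ArrayDeque<T>> head = it.next();
        it.remove();
        ArrayDeque<T> tasks = head.getValue();
        T task = tasks.pollFirst();
        if (!tasks.isEmpty()) {
            groups.put(head.getKey(), tasks); // re-inserted key goes to the end of the order
        }
        return task;
    }

    public static void main(String[] args) {
        RoundRobinScanQueueSketch<String> queue = new RoundRobinScanQueueSketch<>();
        // Client A's chunks are queued first, followed by client B's.
        queue.offer("A", "A1");
        queue.offer("A", "A2");
        queue.offer("A", "A3");
        queue.offer("B", "B1");
        queue.offer("B", "B2");
        for (String task = queue.poll(); task != null; task = queue.poll()) {
            System.out.print(task + " "); // A1 B1 A2 B2 A3
        }
        System.out.println();
    }
}
```

Even though all of client A's chunks were queued ahead of client B's, B's first chunk is served second rather than after A finishes, which is the fairness property the issue asks for.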



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
