[ 
https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16268045#comment-16268045
 ] 

binlijin commented on HBASE-19290:
----------------------------------

bq. Can you please walk me through a full case explaining need of that "if" 
condition and why value of grabbedTask=0 is special.
The task loop has the following stages:
(1) getTaskList: get the tasks from the zookeeper splitLogZNode node.
 This issues a zookeeper request.
(2) A for loop that tries to grab each of the tasks.
  This issues zookeeper requests.
(3) A while loop:
  sleep while seq_start == taskReadySeq.get(), otherwise skip.
When grabbedTask=0 and stage 3 (the while loop) is skipped, the regionserver 
will immediately try to grab tasks again, may again get none, and so keeps 
putting pressure on zookeeper; that is why we throttle it. 
When grabbedTask=1 and stage 3 is skipped, the regionserver will try to grab 
tasks again; if it gets no task in that round it will be throttled, and if it 
does get a task it may try to grab tasks again in the next round.
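
The decision described above can be sketched roughly as follows. This is only an illustration of the reasoning in this comment, not the code from the patch: the class TaskLoopSketch, the methods grabRound and shouldThrottle, and the use of a Predicate in place of a real zookeeper call are all hypothetical names chosen for the example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

public class TaskLoopSketch {

    // Stage 2: try to grab every listed task.
    // tryGrab is a hypothetical stand-in for the zookeeper request made per attempt.
    static int grabRound(List<String> tasks, Predicate<String> tryGrab) {
        int grabbedTask = 0;
        for (String task : tasks) {
            if (tryGrab.test(task)) {
                grabbedTask++;
            }
        }
        return grabbedTask;
    }

    // Stage 3 only sleeps while seqStart == taskReadySeq. If the sequence has
    // already moved on, stage 3 is skipped, so a round that grabbed nothing
    // would go straight back to zookeeper. That is the case to throttle.
    static boolean shouldThrottle(int grabbedTask, int seqStart, int taskReadySeq) {
        boolean stage3Skipped = seqStart != taskReadySeq;
        return grabbedTask == 0 && stage3Skipped;
    }

    public static void main(String[] args) {
        List<String> tasks = Arrays.asList("task-a", "task-b");

        // Round where every grab attempt fails and the sequence has changed.
        int none = grabRound(tasks, t -> false);
        System.out.println("grabbed=" + none + " throttle=" + shouldThrottle(none, 5, 6));

        // Round where one task is grabbed: no extra throttling this round.
        int one = grabRound(tasks, t -> t.equals("task-a"));
        System.out.println("grabbed=" + one + " throttle=" + shouldThrottle(one, 5, 6));
    }
}
```

With grabbedTask=1 the next round is attempted without delay, and that next round is throttled only if it in turn grabs nothing.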


> Reduce zk request when doing split log
> --------------------------------------
>
>                 Key: HBASE-19290
>                 URL: https://issues.apache.org/jira/browse/HBASE-19290
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: binlijin
>            Assignee: binlijin
>         Attachments: HBASE-19290.master.001.patch, 
> HBASE-19290.master.002.patch, HBASE-19290.master.003.patch, 
> HBASE-19290.master.004.patch
>
>
> We observed that once a cluster has 1000+ nodes and hundreds of nodes abort 
> and start splitting logs, the split is very slow, and we found that the 
> regionserver and master are waiting on zookeeper responses, so we need to 
> reduce zookeeper requests and pressure for big clusters.
> (1) Reduce requests to rsZNode: every time calculateAvailableSplitters is 
> called it gets rsZNode's children from zookeeper, and when the cluster is 
> huge this is heavy. This patch reduces these requests. 
> (2) When the regionserver already has the maximum number of split tasks 
> running, it may still try to grab tasks and issue zookeeper requests; it 
> should sleep and wait until it can grab tasks again.  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
