[ 
https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16263089#comment-16263089
 ] 

Appy commented on HBASE-19290:
------------------------------

{quote}
bq. I can see it's doing that. The question really meant: why such an interesting 
choice? It's not a usual thing to do, i.e. throttle for the first request and start 
hammering servers after that. If it was something you chose by design, please 
add a comment describing the behavior and explaining the reasoning; if not by design, 
then probably remove the 'if condition' and always sleep.
 The design was not chosen by me, and I do not see any problem with it: 
there are 2 available splitters, and a worker that has grabbed one task then tries 
to grab the second and does the split tasks as fast as possible. My patch does not 
change the design; it just tries to issue fewer zk requests.
{quote}
The patch adds throttling, so it certainly changes the way things work. 
It also adds the 'if' condition controlling throttling, so the choice is 
definitely being made by you.
I suspect you are using grabbedTask=0 as a proxy for 'failed to grab task' and 
sleeping on that. But when grabbedTask=1 and we still keep failing to grab tasks, 
there is no throttling for that case! Hopefully that makes my question clearer?
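
To make the concern concrete, here is a rough sketch in Java. It is a simulation, not the patch or the HBase code: the class name, method name, and the boolean array of grab outcomes are all made up for illustration, and the "sleep only when grabbedTask is 0" rule is my reading of the condition discussed above. It counts the failed grab attempts that would be followed immediately by another zk request, with no sleep in between.

```java
/** Hypothetical simulation of the two throttling policies; not HBase code. */
final class SplitTaskThrottleSketch {

  /**
   * Counts failed grab attempts that are NOT followed by a back-off.
   *
   * @param grabSucceeded outcome of each grab attempt, in order (assumed input)
   * @param sleepOnlyWhenNoneGrabbed true: sleep after a failure only while
   *        grabbedTask is 0 (my reading of the patch's 'if' condition);
   *        false: sleep after every failure
   */
  static int unthrottledFailures(boolean[] grabSucceeded,
                                 boolean sleepOnlyWhenNoneGrabbed) {
    int grabbedTask = 0;
    int unthrottled = 0;
    for (boolean ok : grabSucceeded) {
      if (ok) {
        grabbedTask++;
        continue;
      }
      // Decide whether this failed attempt is followed by a sleep.
      boolean sleep = !sleepOnlyWhenNoneGrabbed || grabbedTask == 0;
      if (!sleep) {
        // The next zk request would be issued right away.
        unthrottled++;
      }
    }
    return unthrottled;
  }
}
```

With one successful grab followed by three failures, the conditional policy leaves all three failures unthrottled, while always sleeping on failure leaves none. If no task is ever grabbed, both policies back off after every failure, which is the only case the 'if' condition covers.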


> Reduce zk request when doing split log
> --------------------------------------
>
>                 Key: HBASE-19290
>                 URL: https://issues.apache.org/jira/browse/HBASE-19290
>             Project: HBase
>          Issue Type: Improvement
>            Reporter: binlijin
>            Assignee: binlijin
>         Attachments: HBASE-19290.master.001.patch, 
> HBASE-19290.master.002.patch, HBASE-19290.master.003.patch
>
>
> We observed that once a cluster has 1000+ nodes and hundreds of nodes abort 
> and start log splitting, the split is very slow. We found that the 
> regionservers and master wait on zookeeper responses, so we need to reduce 
> zookeeper requests and pressure for big clusters.
> (1) Reduce requests to rsZNode: every call to calculateAvailableSplitters 
> gets rsZNode's children from zookeeper, which is heavy when the cluster is huge. 
> This patch reduces those requests. 
> (2) When a regionserver already has the max number of split tasks running, it may 
> still try to grab tasks and issue zookeeper requests; it should sleep and wait 
> until it can grab tasks again.  



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
