[
https://issues.apache.org/jira/browse/HBASE-19290?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16260196#comment-16260196
]
binlijin commented on HBASE-19290:
----------------------------------
bq. Why randomize? Can be constant?
No particular reason to randomize; I changed it to a constant.
bq. So there are 2 available splitters, and one grabbed task, we don't stop
here and keep hammering zk?
Yes.
bq. Can do it in "if" condition itself?
Yes, it can. Done.
bq. That while condition is just to handle spurious wakeups. See Object#wait.
You can definitely remove the second sleep (unless there's a concrete reason
not to).
The while loop is entered only when seq_start == taskReadySeq.get(). Every time
splitLogZNode's children change, taskReadySeq is incremented, so the worker
does not enter while (seq_start == taskReadySeq.get()) {} and still keeps
trying to grab tasks and issuing zk requests.
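To make the point about the sequence check concrete, here is a minimal sketch of the wait pattern, not the actual patch. Only seq_start and taskReadySeq come from the discussion above; the class name, onChildrenChanged, awaitChange and the timeout handling are hypothetical. It shows that the loop guards against spurious wakeups but returns immediately once the sequence number has already moved on.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class TaskReadyWaitSketch {
    // Bumped whenever the split-log znode's children change (assumed to be
    // driven by a ZooKeeper watcher callback in the real code).
    private final AtomicInteger taskReadySeq = new AtomicInteger(0);
    private final Object lock = new Object();

    /** Hypothetical watcher hook: the znode's children changed. */
    public void onChildrenChanged() {
        synchronized (lock) {
            taskReadySeq.incrementAndGet();
            lock.notifyAll();
        }
    }

    public int currentSeq() {
        return taskReadySeq.get();
    }

    /**
     * Waits until taskReadySeq differs from seqStart or the timeout elapses.
     * Returns true if the sequence changed, false on timeout or interrupt.
     */
    public boolean awaitChange(int seqStart, long timeoutMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        synchronized (lock) {
            // Re-check the condition after every wakeup (see Object#wait).
            while (seqStart == taskReadySeq.get()) {
                long remainingMillis = (deadline - System.nanoTime()) / 1_000_000L;
                if (remainingMillis <= 0) {
                    return false; // timed out without any change
                }
                try {
                    lock.wait(remainingMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
            return true;
        }
    }
}
```

If the children changed between reading seq_start and calling awaitChange, the loop body never runs and the caller goes straight back to grabbing tasks, which is the behaviour described above.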
> Reduce zk request when doing split log
> --------------------------------------
>
> Key: HBASE-19290
> URL: https://issues.apache.org/jira/browse/HBASE-19290
> Project: HBase
> Issue Type: Improvement
> Reporter: binlijin
> Assignee: binlijin
> Attachments: HBASE-19290.master.001.patch,
> HBASE-19290.master.002.patch, HBASE-19290.master.003.patch
>
>
> We observe that once the cluster has 1000+ nodes and hundreds of nodes abort
> and start splitting logs, the split is very slow, and we find that the
> regionservers and master are waiting on zookeeper responses, so we need to
> reduce zookeeper requests and pressure for big clusters.
> (1) Reduce requests to rsZNode. Every call to calculateAvailableSplitters
> gets rsZNode's children from zookeeper, which is heavy when the cluster is
> huge. This patch reduces those requests.
> (2) When the regionserver already has the maximum number of split tasks
> running, it may still try to grab tasks and issue zookeeper requests. It
> should sleep and wait until it can grab tasks again.
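As an illustration of item (1), one way to avoid re-reading rsZNode's children on every call is to cache the count and refresh it only after a change notification. This is a sketch under assumptions and not necessarily how the attached patches do it: the class name, markStale, and the IntSupplier standing in for the zookeeper read are all hypothetical.

```java
import java.util.function.IntSupplier;

public class CachedServerCountSketch {
    // Stands in for the expensive "list children of rsZNode" call (assumption).
    private final IntSupplier fetchFromZk;
    private volatile boolean stale = true;
    private int cached;

    public CachedServerCountSketch(IntSupplier fetchFromZk) {
        this.fetchFromZk = fetchFromZk;
    }

    /** Hypothetical watcher hook: rsZNode's children changed. */
    public void markStale() {
        stale = true;
    }

    /** Returns the cached count, hitting the backing store only when stale. */
    public synchronized int get() {
        if (stale) {
            // Clear the flag before fetching so a concurrent markStale()
            // during the fetch triggers another refresh on the next call.
            stale = false;
            cached = fetchFromZk.getAsInt();
        }
        return cached;
    }
}
```

With this shape, repeated calls between membership changes cost no zookeeper round trips; the trade-off is that the count can lag briefly behind the real cluster state.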