[
https://issues.apache.org/jira/browse/HBASE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237196#comment-13237196
]
Jean-Daniel Cryans commented on HBASE-4993:
-------------------------------------------
@Nic
I think a bug was introduced here. Here's the new waiting logic in
waitForRegionServers:
{code}
- the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
there have been no new region server in for
'hbase.master.wait.on.regionservers.interval' time
{code}
And the code that verifies that:
{code}
!(lastCountChange+interval > now && count >= minToStart)
{code}
If you have 0 region servers that checked in and you are under the interval,
you wait: not (true and false) = true.
If you have 0 region servers but you are above the interval, you wait: not
(false and false) = true.
If you have 1 or more region servers that checked in and you are under the
interval, you continue: not (true and true) = false.
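To make those cases easy to re-check, here is a minimal standalone sketch that isolates just the quoted boolean expression. The class and method names are hypothetical and this is not the actual ServerManager code, which also considers the timeout and the maximum server count. The second method is only one possible reading of the described intent (minimum reached AND no count change for a full interval), not necessarily how it should be fixed.

```java
/** Sketch only: isolates the single boolean expression quoted above. */
final class WaitConditionSketch {

    /** The expression exactly as quoted; true means "keep waiting". */
    static boolean keepWaitingAsQuoted(long lastCountChange, long interval,
                                       long now, int count, int minToStart) {
        return !(lastCountChange + interval > now && count >= minToStart);
    }

    /**
     * Assumed reading of the described intent: stop waiting only once
     * minToStart is reached AND the count has been stable for a full interval.
     */
    static boolean keepWaitingAsDescribed(long lastCountChange, long interval,
                                          long now, int count, int minToStart) {
        boolean stableForInterval = now - lastCountChange >= interval;
        return !(stableForInterval && count >= minToStart);
    }

    public static void main(String[] args) {
        long interval = 1500L;   // ms, same value as in the log excerpt
        int minToStart = 1;      // same minimum as in the log excerpt
        long lastCountChange = 0L;
        long[] nows = {100L, 2000L};  // under and over the interval
        int[] counts = {0, 1};
        for (int count : counts) {
            for (long now : nows) {
                System.out.printf(
                    "count=%d elapsed=%dms quoted=%b described=%b%n",
                    count, now - lastCountChange,
                    keepWaitingAsQuoted(lastCountChange, interval, now, count, minToStart),
                    keepWaitingAsDescribed(lastCountChange, interval, now, count, minToStart));
            }
        }
    }
}
```

With the quoted expression, the only combination that stops waiting is "minimum reached and still under the interval"; once the interval has elapsed with at least one server, it goes back to waiting, which looks like the inverse of what the description says.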
Here's an example from the master log:
{noformat}
2012-03-23 21:45:22,002 INFO org.apache.hadoop.hbase.master.ServerManager:
Waiting for region servers count to settle; currently checked in 0, slept for 0
ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval
of 1500 ms.
2012-03-23 21:45:22,882 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=sv4r27s44,62023,1332539122398
2012-03-23 21:45:22,883 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=sv4r29s44,62023,1332539122438
2012-03-23 21:45:22,883 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=sv4r25s44,62023,1332539122404
2012-03-23 21:45:22,885 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=sv4r6s38,62023,1332539122354
2012-03-23 21:45:22,885 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=sv4r8s38,62023,1332539122396
2012-03-23 21:45:22,886 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=sv4r5s38,62023,1332539122427
2012-03-23 21:45:22,886 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=sv4r28s44,62023,1332539122402
2012-03-23 21:45:22,887 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=sv4r31s44,62023,1332539122387
2012-03-23 21:45:22,887 INFO org.apache.hadoop.hbase.master.ServerManager:
Registering server=sv4r30s44,62023,1332539122392
2012-03-23 21:45:22,906 INFO org.apache.hadoop.hbase.master.ServerManager:
Finished waiting for region servers count to settle; checked in 9, slept for
904 ms, expecting minimum of 1, maximum of 2147483647, master is running.
{noformat}
As you can see, we waited less than a second before the master proceeded. It
isn't too bad here because all 9 servers in my cluster had checked in, but the
first time I ran 0.94 it proceeded with only 1 server. This could be disastrous
at scale; we really need to wait longer than that. In fact, I think I preferred
the old way of doing it.
> Performance regression in minicluster creation
> ----------------------------------------------
>
> Key: HBASE-4993
> URL: https://issues.apache.org/jira/browse/HBASE-4993
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.94.0
> Environment: all
> Reporter: nkeywal
> Assignee: nkeywal
> Fix For: 0.94.0
>
> Attachments: 4993.patch, 4993.v3.patch
>
>
> Side effect of 4610: the mini cluster needs 4.5 seconds to start