[ 
https://issues.apache.org/jira/browse/HBASE-4993?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13237196#comment-13237196
 ] 

Jean-Daniel Cryans commented on HBASE-4993:
-------------------------------------------

@Nic

I think a bug was introduced here. Here's the new waiting logic in 
waitForRegionServers:

{code}
- the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
   there have been no new region server in for
      'hbase.master.wait.on.regionservers.interval' time
{code}

And the code that verifies that:

{code}
  !(lastCountChange+interval > now && count >= minToStart)
{code}

If you have 0 region servers that checked in and you are under the interval, 
you wait: not (true and false) = true.
If you have 0 region servers but you are above the interval, you wait: not 
(false and false) = true.
If you have 1 or more region servers that checked in and you are under the 
interval, you continue: not (true and true) = false.
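
To make the three cases above easy to check, here is a minimal standalone 
sketch that evaluates the quoted condition outside of HBase. The class and 
method names are made up for illustration. keepWaitingAsDocumented is only my 
reading of the documented intent (stop once mintostart is reached AND the 
count has been stable for the interval); it is not code taken from any patch.

```java
// Sketch only: evaluates the quoted condition in isolation, outside of HBase.
class WaitConditionCheck {
    // The condition quoted above; the loop keeps waiting while this is true.
    static boolean keepWaiting(long lastCountChange, long interval, long now,
                               int count, int minToStart) {
        return !(lastCountChange + interval > now && count >= minToStart);
    }

    // Assumption: one reading of the documented intent, i.e. keep waiting
    // until minToStart is reached AND no count change for 'interval' ms.
    static boolean keepWaitingAsDocumented(long lastCountChange, long interval,
                                           long now, int count, int minToStart) {
        return lastCountChange + interval > now || count < minToStart;
    }

    public static void main(String[] args) {
        long interval = 1500; // ms, same value as in the log below
        int minToStart = 1;

        // 0 servers, under the interval: waits
        System.out.println(keepWaiting(0, interval, 500, 0, minToStart));  // true
        // 0 servers, above the interval: waits
        System.out.println(keepWaiting(0, interval, 2000, 0, minToStart)); // true
        // 1 server, under the interval: stops waiting right away
        System.out.println(keepWaiting(0, interval, 500, 1, minToStart));  // false
        // Same inputs under the documented reading: still waits
        System.out.println(
            keepWaitingAsDocumented(0, interval, 500, 1, minToStart));     // true
    }
}
```

The third call is the problematic one: as soon as a single region server has 
checked in, the quoted condition lets the master proceed without any quiet 
period.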

Here's an example:

{noformat}
2012-03-23 21:45:22,002 INFO org.apache.hadoop.hbase.master.ServerManager: 
Waiting for region servers count to settle; currently checked in 0, slept for 0 
ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval 
of 1500 ms.
2012-03-23 21:45:22,882 INFO org.apache.hadoop.hbase.master.ServerManager: 
Registering server=sv4r27s44,62023,1332539122398
2012-03-23 21:45:22,883 INFO org.apache.hadoop.hbase.master.ServerManager: 
Registering server=sv4r29s44,62023,1332539122438
2012-03-23 21:45:22,883 INFO org.apache.hadoop.hbase.master.ServerManager: 
Registering server=sv4r25s44,62023,1332539122404
2012-03-23 21:45:22,885 INFO org.apache.hadoop.hbase.master.ServerManager: 
Registering server=sv4r6s38,62023,1332539122354
2012-03-23 21:45:22,885 INFO org.apache.hadoop.hbase.master.ServerManager: 
Registering server=sv4r8s38,62023,1332539122396
2012-03-23 21:45:22,886 INFO org.apache.hadoop.hbase.master.ServerManager: 
Registering server=sv4r5s38,62023,1332539122427
2012-03-23 21:45:22,886 INFO org.apache.hadoop.hbase.master.ServerManager: 
Registering server=sv4r28s44,62023,1332539122402
2012-03-23 21:45:22,887 INFO org.apache.hadoop.hbase.master.ServerManager: 
Registering server=sv4r31s44,62023,1332539122387
2012-03-23 21:45:22,887 INFO org.apache.hadoop.hbase.master.ServerManager: 
Registering server=sv4r30s44,62023,1332539122392
2012-03-23 21:45:22,906 INFO org.apache.hadoop.hbase.master.ServerManager: 
Finished waiting for region servers count to settle; checked in 9, slept for 
904 ms, expecting minimum of 1, maximum of 2147483647, master is running.
{noformat}

As you can see, we waited less than a second before the master proceeded. It 
is not too bad here because this cluster has 9 servers and all of them checked 
in, but the first time I ran 0.94 it proceeded with only 1 server. This could 
be disastrous at scale; we really need to wait longer than that. In fact, I 
think I preferred the old way of doing it.
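
Plugging the numbers from the log into the quoted condition gives the same 
result. This is a rough sketch: the interval, minimum, count and sleep time 
come from the log lines, but the 885 ms offset for the last count change is my 
assumption (last "Registering server" timestamp minus the first "Waiting" 
timestamp), and the class and method names are hypothetical.

```java
// Sketch only: replays the log's numbers through the quoted condition.
class SettleTimeline {
    // The quoted condition; the master keeps waiting while this is true.
    static boolean keepWaiting(long lastCountChange, long interval, long now,
                               int count, int minToStart) {
        return !(lastCountChange + interval > now && count >= minToStart);
    }

    // Earliest time (ms since the wait started) at which the count would
    // have been unchanged for a full interval.
    static long quietPeriodEnd(long lastCountChange, long interval) {
        return lastCountChange + interval;
    }

    public static void main(String[] args) {
        long interval = 1500;       // log: "interval of 1500 ms"
        int minToStart = 1;         // log: "expecting minimum of 1"
        int count = 9;              // log: "checked in 9"
        long now = 904;             // log: "slept for 904 ms"
        long lastCountChange = 885; // assumption: 21:45:22,887 - 21:45:22,002

        System.out.println("keeps waiting: "
            + keepWaiting(lastCountChange, interval, now, count, minToStart));
        System.out.println("quiet period would end at ms: "
            + quietPeriodEnd(lastCountChange, interval));
    }
}
```

Under these assumptions the condition is already false at 904 ms, while a full 
quiet interval after the last registration would only have elapsed at about 
2385 ms, still well under the 4500 ms timeout.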
                
> Performance regression in minicluster creation
> ----------------------------------------------
>
>                 Key: HBASE-4993
>                 URL: https://issues.apache.org/jira/browse/HBASE-4993
>             Project: HBase
>          Issue Type: Bug
>          Components: master
>    Affects Versions: 0.94.0
>         Environment: all
>            Reporter: nkeywal
>            Assignee: nkeywal
>             Fix For: 0.94.0
>
>         Attachments: 4993.patch, 4993.v3.patch
>
>
> Side effect of 4610: the mini cluster needs 4.5 seconds to start

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
