[ https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jean-Daniel Cryans reassigned HBASE-5639:
-----------------------------------------

    Assignee: Jean-Daniel Cryans  (was: nkeywal)

Here's what I see now with the patch:

{noformat}
2012-03-27 18:53:07,644 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 0, slept for 0 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2012-03-27 18:53:08,638 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r29s44,62023,1332874388301
2012-03-27 18:53:08,638 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r27s44,62023,1332874388324
2012-03-27 18:53:08,649 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 2, slept for 1005 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2012-03-27 18:53:08,656 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r5s38,62023,1332874388319
2012-03-27 18:53:08,657 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r6s38,62023,1332874388364
2012-03-27 18:53:08,662 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r8s38,62023,1332874388371
2012-03-27 18:53:08,699 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 5, slept for 1055 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2012-03-27 18:53:08,897 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r31s44,62023,1332874388453
2012-03-27 18:53:08,900 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 6, slept for 1256 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2012-03-27 18:53:09,602 INFO org.apache.hadoop.hbase.master.ServerManager: Registering server=sv4r30s44,62023,1332874388969
2012-03-27 18:53:09,603 INFO org.apache.hadoop.hbase.master.ServerManager: Waiting for region servers count to settle; currently checked in 7, slept for 1959 ms, expecting minimum of 1, maximum of 2147483647, timeout of 4500 ms, interval of 1500 ms.
2012-03-27 18:53:11,110 INFO org.apache.hadoop.hbase.master.ServerManager: Finished waiting for region servers count to settle; checked in 7, slept for 3466 ms, expecting minimum of 1, maximum of 2147483647, master is running.
{noformat}

It confirms it did the right thing, go wild Lars :)

> The logic used in waiting for region servers during startup is broken
> ---------------------------------------------------------------------
>
>                 Key: HBASE-5639
>                 URL: https://issues.apache.org/jira/browse/HBASE-5639
>             Project: HBase
>          Issue Type: Bug
>            Reporter: Jean-Daniel Cryans
>            Assignee: Jean-Daniel Cryans
>            Priority: Blocker
>             Fix For: 0.94.0
>
>         Attachments: HBASE-5639.patch
>
>
> See the tail of HBASE-4993, which I'll report here:
> Me:
> {quote}
> I think a bug was introduced here. Here's the new waiting logic in
> waitForRegionServers:
> the 'hbase.master.wait.on.regionservers.mintostart' is reached AND
> there have been no new region server in for
> 'hbase.master.wait.on.regionservers.interval' time
> And the code that verifies that:
> !(lastCountChange+interval > now && count >= minToStart)
> {quote}
> Nic:
> {quote}
> It seems that changing the code to
> (count < minToStart ||
> lastCountChange+interval > now)
> would make the code works as documented.
> If you have 0 region servers that checked in and you are under the interval,
> you wait: (true or true) = true.
> If you have 0 region servers but you are above the interval, you wait: (true
> or false) = true.
> If you have 1 or more region servers that checked in and you are under the
> interval, you wait: (false or true) = true.
> {quote}
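
The two conditions quoted in the issue description can be compared side by side. What follows is only a minimal sketch, not the HBase source: the class name WaitConditionSketch and both method names are hypothetical, each condition is treated as "keep waiting while this is true", and the other bounds visible in the log (timeout, maximum) are deliberately left out.

```java
// Hypothetical helper for illustration only; this is not ServerManager code.
final class WaitConditionSketch {

    // Condition quoted in the issue as the one currently in waitForRegionServers.
    static boolean keepWaitingQuoted(int count, int minToStart,
                                     long lastCountChange, long interval, long now) {
        return !(lastCountChange + interval > now && count >= minToStart);
    }

    // Condition proposed in the quoted comment: wait while the minimum has not
    // been reached OR the count changed within the last interval.
    static boolean keepWaitingProposed(int count, int minToStart,
                                       long lastCountChange, long interval, long now) {
        return count < minToStart || lastCountChange + interval > now;
    }

    public static void main(String[] args) {
        int minToStart = 1;
        long interval = 1500L;       // ms, matching the interval shown in the log
        long lastCountChange = 0L;   // arbitrary reference point for the sketch
        long[] nows = {1000L, 2000L}; // one value under the interval, one above
        int[] counts = {0, 1};

        // Print both conditions for the four combinations discussed in the issue.
        for (int count : counts) {
            for (long now : nows) {
                System.out.printf("count=%d now=%d quoted=%b proposed=%b%n",
                        count, now,
                        keepWaitingQuoted(count, minToStart, lastCountChange, interval, now),
                        keepWaitingProposed(count, minToStart, lastCountChange, interval, now));
            }
        }
    }
}
```

The two conditions differ only in the rows where at least the minimum number of region servers has checked in: there, the quoted condition stops waiting while servers are still arriving and keeps waiting once the count has been stable for a full interval, which is the reverse of the documented behavior.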