[jira] [Commented] (HBASE-5639) The logic used in waiting for region servers during startup is broken
[ https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13240891#comment-13240891 ] Hudson commented on HBASE-5639: --- Integrated in HBase-TRUNK #2697 (See [https://builds.apache.org/job/HBase-TRUNK/2697/]) HBASE-5639 The logic used in waiting for region servers during startup is broken (J-D and NKeyval) (Revision 1306012) Result = FAILURE larsh : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java The logic used in waiting for region servers during startup is broken - Key: HBASE-5639 URL: https://issues.apache.org/jira/browse/HBASE-5639 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.94.0 Attachments: HBASE-5639.patch See the tail of HBASE-4993, which I'll report here: Me: {quote} I think a bug was introduced here. Here's the new waiting logic in waitForRegionServers: the 'hbase.master.wait.on.regionservers.mintostart' is reached AND there have been no new region server in for 'hbase.master.wait.on.regionservers.interval' time And the code that verifies that: !(lastCountChange+interval now count = minToStart) {quote} Nic: {quote} It seems that changing the code to (count minToStart || lastCountChange+interval now) would make the code works as documented. If you have 0 region servers that checked in and you are under the interval, you wait: (true or true) = true. If you have 0 region servers but you are above the interval, you wait: (true or false) = true. If you have 1 or more region servers that checked in and you are under the interval, you wait: (false or true) = true. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5639) The logic used in waiting for region servers during startup is broken
[ https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239661#comment-13239661 ] Lars Hofhansl commented on HBASE-5639: -- Ready for commit? I'm happy to commit, since I'd have to update CHANGES.txt as well (as this will be an RC candidate build). The logic used in waiting for region servers during startup is broken - Key: HBASE-5639 URL: https://issues.apache.org/jira/browse/HBASE-5639 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: nkeywal Priority: Blocker Fix For: 0.94.0 Attachments: HBASE-5639.patch See the tail of HBASE-4993, which I'll report here: Me: {quote} I think a bug was introduced here. Here's the new waiting logic in waitForRegionServers: the 'hbase.master.wait.on.regionservers.mintostart' is reached AND there have been no new region server in for 'hbase.master.wait.on.regionservers.interval' time And the code that verifies that: !(lastCountChange+interval now count = minToStart) {quote} Nic: {quote} It seems that changing the code to (count minToStart || lastCountChange+interval now) would make the code works as documented. If you have 0 region servers that checked in and you are under the interval, you wait: (true or true) = true. If you have 0 region servers but you are above the interval, you wait: (true or false) = true. If you have 1 or more region servers that checked in and you are under the interval, you wait: (false or true) = true. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5639) The logic used in waiting for region servers during startup is broken
[ https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239964#comment-13239964 ] Hudson commented on HBASE-5639: --- Integrated in HBase-0.94 #61 (See [https://builds.apache.org/job/HBase-0.94/61/]) HBASE-5639 The logic used in waiting for region servers during startup is broken (J-D and NKeyval) (Revision 1306011) Result = FAILURE larsh : Files : * /hbase/branches/0.94/CHANGES.txt * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java The logic used in waiting for region servers during startup is broken - Key: HBASE-5639 URL: https://issues.apache.org/jira/browse/HBASE-5639 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.94.0 Attachments: HBASE-5639.patch See the tail of HBASE-4993, which I'll report here: Me: {quote} I think a bug was introduced here. Here's the new waiting logic in waitForRegionServers: the 'hbase.master.wait.on.regionservers.mintostart' is reached AND there have been no new region server in for 'hbase.master.wait.on.regionservers.interval' time And the code that verifies that: !(lastCountChange+interval now count = minToStart) {quote} Nic: {quote} It seems that changing the code to (count minToStart || lastCountChange+interval now) would make the code works as documented. If you have 0 region servers that checked in and you are under the interval, you wait: (true or true) = true. If you have 0 region servers but you are above the interval, you wait: (true or false) = true. If you have 1 or more region servers that checked in and you are under the interval, you wait: (false or true) = true. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5639) The logic used in waiting for region servers during startup is broken
[ https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239966#comment-13239966 ] Hudson commented on HBASE-5639: --- Integrated in HBase-0.94-security #5 (See [https://builds.apache.org/job/HBase-0.94-security/5/]) HBASE-5639 The logic used in waiting for region servers during startup is broken (J-D and NKeyval) (Revision 1306011) Result = SUCCESS larsh : Files : * /hbase/branches/0.94/CHANGES.txt * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java The logic used in waiting for region servers during startup is broken - Key: HBASE-5639 URL: https://issues.apache.org/jira/browse/HBASE-5639 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.94.0 Attachments: HBASE-5639.patch See the tail of HBASE-4993, which I'll report here: Me: {quote} I think a bug was introduced here. Here's the new waiting logic in waitForRegionServers: the 'hbase.master.wait.on.regionservers.mintostart' is reached AND there have been no new region server in for 'hbase.master.wait.on.regionservers.interval' time And the code that verifies that: !(lastCountChange+interval now count = minToStart) {quote} Nic: {quote} It seems that changing the code to (count minToStart || lastCountChange+interval now) would make the code works as documented. If you have 0 region servers that checked in and you are under the interval, you wait: (true or true) = true. If you have 0 region servers but you are above the interval, you wait: (true or false) = true. If you have 1 or more region servers that checked in and you are under the interval, you wait: (false or true) = true. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5639) The logic used in waiting for region servers during startup is broken
[ https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13240212#comment-13240212 ] Hudson commented on HBASE-5639: --- Integrated in HBase-TRUNK-security #152 (See [https://builds.apache.org/job/HBase-TRUNK-security/152/]) HBASE-5639 The logic used in waiting for region servers during startup is broken (J-D and NKeyval) (Revision 1306012) Result = SUCCESS larsh : Files : * /hbase/trunk/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java The logic used in waiting for region servers during startup is broken - Key: HBASE-5639 URL: https://issues.apache.org/jira/browse/HBASE-5639 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: Jean-Daniel Cryans Priority: Blocker Fix For: 0.94.0 Attachments: HBASE-5639.patch See the tail of HBASE-4993, which I'll report here: Me: {quote} I think a bug was introduced here. Here's the new waiting logic in waitForRegionServers: the 'hbase.master.wait.on.regionservers.mintostart' is reached AND there have been no new region server in for 'hbase.master.wait.on.regionservers.interval' time And the code that verifies that: !(lastCountChange+interval now count = minToStart) {quote} Nic: {quote} It seems that changing the code to (count minToStart || lastCountChange+interval now) would make the code works as documented. If you have 0 region servers that checked in and you are under the interval, you wait: (true or true) = true. If you have 0 region servers but you are above the interval, you wait: (true or false) = true. If you have 1 or more region servers that checked in and you are under the interval, you wait: (false or true) = true. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5639) The logic used in waiting for region servers during startup is broken
[ https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13238973#comment-13238973 ] Jean-Daniel Cryans commented on HBASE-5639: --- Oh I forgot to mention that I'm marking this as a blocker for 0.94.0 because right now if you start a sizable cluster you may end up with region servers that checkin too late and miss the re-assignment of regions. The logic used in waiting for region servers during startup is broken - Key: HBASE-5639 URL: https://issues.apache.org/jira/browse/HBASE-5639 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: nkeywal Priority: Blocker Fix For: 0.94.0 See the tail of HBASE-4993, which I'll report here: Me: {quote} I think a bug was introduced here. Here's the new waiting logic in waitForRegionServers: the 'hbase.master.wait.on.regionservers.mintostart' is reached AND there have been no new region server in for 'hbase.master.wait.on.regionservers.interval' time And the code that verifies that: !(lastCountChange+interval now count = minToStart) {quote} Nic: {quote} It seems that changing the code to (count minToStart || lastCountChange+interval now) would make the code works as documented. If you have 0 region servers that checked in and you are under the interval, you wait: (true or true) = true. If you have 0 region servers but you are above the interval, you wait: (true or false) = true. If you have 1 or more region servers that checked in and you are under the interval, you wait: (false or true) = true. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5639) The logic used in waiting for region servers during startup is broken
[ https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239155#comment-13239155 ] Lars Hofhansl commented on HBASE-5639: -- Are you planning to work on this, J-D? Agree with the blocker status. The logic used in waiting for region servers during startup is broken - Key: HBASE-5639 URL: https://issues.apache.org/jira/browse/HBASE-5639 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: nkeywal Priority: Blocker Fix For: 0.94.0 See the tail of HBASE-4993, which I'll report here: Me: {quote} I think a bug was introduced here. Here's the new waiting logic in waitForRegionServers: the 'hbase.master.wait.on.regionservers.mintostart' is reached AND there have been no new region server in for 'hbase.master.wait.on.regionservers.interval' time And the code that verifies that: !(lastCountChange+interval now count = minToStart) {quote} Nic: {quote} It seems that changing the code to (count minToStart || lastCountChange+interval now) would make the code works as documented. If you have 0 region servers that checked in and you are under the interval, you wait: (true or true) = true. If you have 0 region servers but you are above the interval, you wait: (true or false) = true. If you have 1 or more region servers that checked in and you are under the interval, you wait: (false or true) = true. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HBASE-5639) The logic used in waiting for region servers during startup is broken
[ https://issues.apache.org/jira/browse/HBASE-5639?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13239190#comment-13239190 ] Lars Hofhansl commented on HBASE-5639: -- +1 on patch. Although my mind twisted like a pretzel thinking about the correct condition here. The logic used in waiting for region servers during startup is broken - Key: HBASE-5639 URL: https://issues.apache.org/jira/browse/HBASE-5639 Project: HBase Issue Type: Bug Reporter: Jean-Daniel Cryans Assignee: nkeywal Priority: Blocker Fix For: 0.94.0 Attachments: HBASE-5639.patch See the tail of HBASE-4993, which I'll report here: Me: {quote} I think a bug was introduced here. Here's the new waiting logic in waitForRegionServers: the 'hbase.master.wait.on.regionservers.mintostart' is reached AND there have been no new region server in for 'hbase.master.wait.on.regionservers.interval' time And the code that verifies that: !(lastCountChange+interval now count = minToStart) {quote} Nic: {quote} It seems that changing the code to (count minToStart || lastCountChange+interval now) would make the code works as documented. If you have 0 region servers that checked in and you are under the interval, you wait: (true or true) = true. If you have 0 region servers but you are above the interval, you wait: (true or false) = true. If you have 1 or more region servers that checked in and you are under the interval, you wait: (false or true) = true. {quote} -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira