[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13268877#comment-13268877 ]

Hudson commented on HBASE-5849:
---
Integrated in HBase-0.92-security #106 (See [https://builds.apache.org/job/HBase-0.92-security/106/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REAPPLY (Revision 1330119)
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REAPPLY (Revision 1330118)
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REVERT (Revision 1329562)
HBASE-5849 On first cluster startup, RS aborts if root znode is not available (Revision 1329548)
HBASE-5849 On first cluster startup, RS aborts if root znode is not available (Revision 1329530)
Result = SUCCESS
stack :
Files :
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
stack :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
stack :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
stack :
Files :
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
stack :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java

On first cluster startup, RS aborts if root znode is not available
--
Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.92.2, 0.94.0
Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch

When launching a fresh new cluster, the master has to be started first, which can create a race condition when the master and the region servers (RS) are started at the same time.

Master startup code is something like this:
- establish zk connection
- create root znodes in zk (/hbase)
- create ephemeral node for master /hbase/master

Region server startup code is something like this:
- establish zk connection
- check whether the root znode (/hbase) is there. If not, shut down.
- wait for the master to create the znode /hbase/master

So the problem is that on the very first launch of the cluster, the RS aborts instead of starting, since the /hbase znode might not have been created yet (only the master creates it if needed). Since /hbase/ is not deleted on cluster shutdown, on subsequent cluster starts it does not matter in which order the servers are started. So this affects only first launches.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
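
The startup race described in the issue can be illustrated with a small, self-contained Java sketch. It is not the committed patch and does not use the ZooKeeper client: the `znodes` set is an in-memory stand-in for the ZooKeeper namespace, and every class and method name here (`BootOrderSketch`, `waitFor`, `regionServerStartAborting`, `regionServerStartWaiting`, `masterStart`) is hypothetical. It contrasts the check-once-and-abort behaviour from the report with one plausible alternative, in which the region server polls until the master has created /hbase.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BooleanSupplier;

public class BootOrderSketch {

    /** In-memory stand-in for the ZooKeeper namespace (assumption: not a real ZK client). */
    static final Set<String> znodes = ConcurrentHashMap.newKeySet();

    /**
     * Polls the condition until it holds or the timeout elapses.
     * Returns true if the condition became true in time, false otherwise.
     */
    static boolean waitFor(BooleanSupplier condition, long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() >= deadline) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and report that startup did not complete.
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    /** Master side: create the root znode, then register under /hbase/master. */
    static void masterStart() {
        znodes.add("/hbase");
        znodes.add("/hbase/master");
    }

    /** Behaviour described in the report: check /hbase once and give up if it is missing. */
    static boolean regionServerStartAborting() {
        return znodes.contains("/hbase");
    }

    /** Alternative sketch: wait for /hbase to appear, then for /hbase/master. */
    static boolean regionServerStartWaiting(long timeoutMs) {
        return waitFor(() -> znodes.contains("/hbase"), timeoutMs, 10)
            && waitFor(() -> znodes.contains("/hbase/master"), timeoutMs, 10);
    }

    public static void main(String[] args) throws InterruptedException {
        znodes.clear();
        // The region server starts first on a fresh cluster: the one-shot check fails.
        System.out.println("abort-style RS started: " + regionServerStartAborting());

        // The master comes up slightly later on another thread.
        Thread master = new Thread(() -> {
            try {
                Thread.sleep(100);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            masterStart();
        });
        master.start();

        // The waiting variant tolerates the late master, within its timeout.
        System.out.println("wait-style RS started: " + regionServerStartWaiting(5000));
        master.join();
    }
}
```

The timeout keeps the waiting variant from blocking forever if no master ever starts; whether a real region server should give up eventually or wait indefinitely is a design choice that this sketch does not settle.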
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261339#comment-13261339 ]

Hudson commented on HBASE-5849:
---
Integrated in HBase-TRUNK-security #183 (See [https://builds.apache.org/job/HBase-TRUNK-security/183/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REVERT (Revision 1329560)
Result = FAILURE
stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261421#comment-13261421 ]

Hudson commented on HBASE-5849:
---
Integrated in HBase-TRUNK #2811 (See [https://builds.apache.org/job/HBase-TRUNK/2811/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REAPPLY (Revision 1330116)
Result = FAILURE
stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261423#comment-13261423 ]

Hudson commented on HBASE-5849:
---
Integrated in HBase-0.94 #148 (See [https://builds.apache.org/job/HBase-0.94/148/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REAPPLY (Revision 1330117)
Result = SUCCESS
stack :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261475#comment-13261475 ]

Hudson commented on HBASE-5849:
---
Integrated in HBase-0.92 #390 (See [https://builds.apache.org/job/HBase-0.92/390/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REAPPLY (Revision 1330119)
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REAPPLY (Revision 1330118)
Result = FAILURE
stack :
Files :
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
stack :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261896#comment-13261896 ]

Hudson commented on HBASE-5849:
---
Integrated in HBase-0.94-security #21 (See [https://builds.apache.org/job/HBase-0.94-security/21/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REAPPLY (Revision 1330117)
Result = FAILURE
stack :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262137#comment-13262137 ]

Enis Soztutar commented on HBASE-5849:
--
Thanks all for pursuing this.

From the failed Hudson builds:
https://builds.apache.org/job/HBase-TRUNK-security/183/
https://builds.apache.org/job/HBase-TRUNK/2811/testReport/
https://builds.apache.org/job/HBase-0.92/390/
https://builds.apache.org/job/HBase-0.94-security/21/
None of the test failures seem related.

@Stack, for EvictionThread, I guess that since the git repo is falling behind, I might not have your recent changes (I'm too lazy to check out from svn). I also saw some other daemon threads, though (a couple of IPC Client threads, etc.). Let me dig into that later and see if we can improve on it. I'll open another jira if I find anything interesting.
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13262276#comment-13262276 ]

Hudson commented on HBASE-5849:
---
Integrated in HBase-TRUNK-security #184 (See [https://builds.apache.org/job/HBase-TRUNK-security/184/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REAPPLY (Revision 1330116)
Result = FAILURE
stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260253#comment-13260253 ]

Hudson commented on HBASE-5849:
---
Integrated in HBase-0.92 #387 (See [https://builds.apache.org/job/HBase-0.92/387/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available (Revision 1329548)
Result = ABORTED
stack :
Files :
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260255#comment-13260255 ]

stack commented on HBASE-5849:
--
I killed all running builds in case they'd run into this hang.
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260256#comment-13260256 ]

stack commented on HBASE-5849:
--
Enis, mind taking a look at this?
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260262#comment-13260262 ]

Hudson commented on HBASE-5849:
---
Integrated in HBase-TRUNK-security #182 (See [https://builds.apache.org/job/HBase-TRUNK-security/182/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available (Revision 1329527)
Result = SUCCESS
stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260295#comment-13260295 ]

Hudson commented on HBASE-5849:
---
Integrated in HBase-TRUNK #2804 (See [https://builds.apache.org/job/HBase-TRUNK/2804/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REVERT (Revision 1329560)
Result = SUCCESS
stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260308#comment-13260308 ] Hudson commented on HBASE-5849: --- Integrated in HBase-0.94 #141 (See [https://builds.apache.org/job/HBase-0.94/141/]) HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REVERT (Revision 1329561) Result = FAILURE stack : Files : * /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java On first cluster startup, RS aborts if root znode is not available -- Key: HBASE-5849 URL: https://issues.apache.org/jira/browse/HBASE-5849 Project: HBase Issue Type: Bug Components: master, regionserver, zookeeper Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.92.2, 0.94.0 Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch When launching a fresh new cluster, the master has to be started first, which might create race conditions for starting master and rs at the same time. Master startup code is smt like this: - establish zk connection - create root znodes in zk (/hbase) - create ephemeral node for master /hbase/master, Region server start up code is smt like this: - establish zk connection - check whether the root znode (/hbase) is there. If not, shutdown. - wait for the master to create znodes /hbase/master So, the problem is on the very first launch of the cluster, RS aborts to start since /hbase znode might not have been created yet (only the master creates it if needed). Since /hbase/ is not deleted on cluster shutdown, on subsequent cluster starts, it does not matter which order the servers are started. So this affects only first launchs. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
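The two startup sequences quoted in the issue description can be illustrated with a small simulation. This is only a sketch under stated assumptions: `FakeZk`, `start_master`, and `start_region_server` are hypothetical names invented here, the in-memory store stands in for ZooKeeper and is not its client API, and the "wait instead of abort" branch is one possible alternative to the abort, not necessarily what the attached patches do.

```python
import threading


class FakeZk:
    """In-memory stand-in for a znode store (hypothetical, not the ZooKeeper API)."""

    def __init__(self):
        self._nodes = set()
        self._cond = threading.Condition()

    def create(self, path):
        # Record the znode and wake up anyone waiting for it.
        with self._cond:
            self._nodes.add(path)
            self._cond.notify_all()

    def exists(self, path):
        with self._cond:
            return path in self._nodes

    def wait_for(self, path, timeout):
        # Block until the znode exists or the timeout (seconds) expires.
        with self._cond:
            return self._cond.wait_for(lambda: path in self._nodes, timeout)


def start_master(zk):
    # Master: create the root znode, then its own node.
    zk.create("/hbase")
    zk.create("/hbase/master")


def start_region_server(zk, abort_if_root_missing, timeout=5.0):
    # Behaviour described in the issue: abort when /hbase is absent.
    if abort_if_root_missing and not zk.exists("/hbase"):
        return "aborted"
    # Alternative (assumption): just wait for the master's znode to appear.
    if zk.wait_for("/hbase/master", timeout):
        return "started"
    return "timed out"


if __name__ == "__main__":
    # Fresh cluster, RS starts first and checks the root znode.
    print(start_region_server(FakeZk(), abort_if_root_missing=True))

    # Fresh cluster, RS starts first but waits; master comes up shortly after.
    zk = FakeZk()
    master = threading.Timer(0.1, start_master, args=(zk,))
    master.start()
    print(start_region_server(zk, abort_if_root_missing=False))
    master.join()
```

With the check enabled, the region server gives up on a fresh store regardless of how soon the master would have arrived; with the check disabled, the start order no longer matters as long as the master shows up within the timeout.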
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260358#comment-13260358 ]

Hudson commented on HBASE-5849:

Integrated in HBase-0.92 #388 (See [https://builds.apache.org/job/HBase-0.92/388/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REVERT (Revision 1329562)

Result = SUCCESS
stack :
Files :
* /hbase/branches/0.92/CHANGES.txt
* /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.92.2, 0.94.0
Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260791#comment-13260791 ]

Enis Soztutar commented on HBASE-5849:

Interesting that Hudson did not report any test failures. Let me dig into this.

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.92.2, 0.94.0
Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260820#comment-13260820 ]

stack commented on HBASE-5849:

@Enis Agreed. I tried this before applying too.

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.92.2, 0.94.0
Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261224#comment-13261224 ]

Hudson commented on HBASE-5849:

Integrated in HBase-0.94-security #20 (See [https://builds.apache.org/job/HBase-0.94-security/20/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available; REVERT (Revision 1329561)
HBASE-5849 On first cluster startup, RS aborts if root znode is not available (Revision 1329528)

Result = SUCCESS
stack :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java

stack :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.92.2, 0.94.0
Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261253#comment-13261253 ]

Hadoop QA commented on HBASE-5849:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12524099/HBASE-5849_v4.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 2 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 5 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.TestRegionRebalancing
org.apache.hadoop.hbase.replication.TestReplication

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1637//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1637//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1637//console

This message is automatically generated.

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.92.2, 0.94.0
Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261308#comment-13261308 ]

Zhihong Yu commented on HBASE-5849:

I ran TestClusterBootOrder in a loop 5 times with patch v4 and didn't see the test hang.

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.92.2, 0.94.0
Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261314#comment-13261314 ]

stack commented on HBASE-5849:

@Enis LruBlockCache.EvictionThread should be cleaned up on cluster shutdown? I thought I fixed that a day or so ago.

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.92.2, 0.94.0
Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13261316#comment-13261316 ]

Lars Hofhansl commented on HBASE-5849:

TestRegionRebalancing is unrelated (see HBASE-5848).
TestReplication passes for me locally with v4 applied.

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Fix For: 0.92.2, 0.94.0
Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch, HBASE-5849_v4-0.92.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch, HBASE-5849_v4.patch
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259927#comment-13259927 ]

Enis Soztutar commented on HBASE-5849:

Upon further inspection, it seems the patch for HBASE-4138 added the check for the base server to the region server startup code. While it makes sense to check for znode.parent from the client side, we should not do that for the regionserver.

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259928#comment-13259928 ]

stack commented on HBASE-5849:

Sounds good Enis. What should RS do then?

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259937#comment-13259937 ]

stack commented on HBASE-5849:

Patch lgtm.

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Attachments: HBASE-5849_v1.patch
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13259983#comment-13259983 ]

Hadoop QA commented on HBASE-5849:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12523865/HBASE-5849_v1.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
-1 tests included. The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
-1 core tests. The patch failed these unit tests:
org.apache.hadoop.hbase.io.hfile.TestForceCacheImportantBlocks
org.apache.hadoop.hbase.util.TestProcessBasedCluster

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1617//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1617//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1617//console

This message is automatically generated.

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Attachments: HBASE-5849_v1.patch
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260086#comment-13260086 ]

Hadoop QA commented on HBASE-5849:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12523888/HBASE-5849_v2.patch
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 2 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac compiler warnings.
-1 findbugs. The patch appears to introduce 3 new Findbugs (version 1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of release audit warnings.
+1 core tests. The patch passed unit tests in .

Test results: https://builds.apache.org/job/PreCommit-HBASE-Build/1620//testReport/
Findbugs warnings: https://builds.apache.org/job/PreCommit-HBASE-Build/1620//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1620//console

This message is automatically generated.

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Attachments: HBASE-5849_v1.patch, HBASE-5849_v2.patch
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13260142#comment-13260142 ]

Hadoop QA commented on HBASE-5849:

-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12523905/5849v3.txt
against trunk revision .

+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 3 new or modified tests.
-1 patch. The patch command could not apply the patch.

Console output: https://builds.apache.org/job/PreCommit-HBASE-Build/1622//console

This message is automatically generated.

On first cluster startup, RS aborts if root znode is not available

Key: HBASE-5849
URL: https://issues.apache.org/jira/browse/HBASE-5849
Project: HBase
Issue Type: Bug
Components: master, regionserver, zookeeper
Affects Versions: 0.92.2, 0.96.0, 0.94.1
Reporter: Enis Soztutar
Assignee: Enis Soztutar
Attachments: 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260160#comment-13260160 ] Hudson commented on HBASE-5849: --- Integrated in HBase-0.92 #386 (See [https://builds.apache.org/job/HBase-0.92/386/]) HBASE-5849 On first cluster startup, RS aborts if root znode is not available (Revision 1329530) Result = FAILURE stack : Files : * /hbase/branches/0.92/CHANGES.txt * /hbase/branches/0.92/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java * /hbase/branches/0.92/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java On first cluster startup, RS aborts if root znode is not available -- Key: HBASE-5849 URL: https://issues.apache.org/jira/browse/HBASE-5849 Project: HBase Issue Type: Bug Components: master, regionserver, zookeeper Affects Versions: 0.92.2, 0.96.0, 0.94.1 Reporter: Enis Soztutar Assignee: Enis Soztutar Fix For: 0.92.2, 0.94.0 Attachments: 5849.jstack, 5849v3.txt, HBASE-5849_v1.patch, HBASE-5849_v2.patch When launching a fresh new cluster, the master has to be started first, which might create race conditions for starting master and rs at the same time. Master startup code is smt like this: - establish zk connection - create root znodes in zk (/hbase) - create ephemeral node for master /hbase/master, Region server start up code is smt like this: - establish zk connection - check whether the root znode (/hbase) is there. If not, shutdown. - wait for the master to create znodes /hbase/master So, the problem is on the very first launch of the cluster, RS aborts to start since /hbase znode might not have been created yet (only the master creates it if needed). Since /hbase/ is not deleted on cluster shutdown, on subsequent cluster starts, it does not matter which order the servers are started. So this affects only first launchs. -- This message is automatically generated by JIRA. 
If you think it was sent incorrectly, please contact your JIRA administrators: https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa For more information on JIRA, see: http://www.atlassian.com/software/jira
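The startup sequences in the issue description can be sketched as follows. This is only an illustration of the race under stated assumptions: `FakeZk` is a hypothetical in-memory stand-in for ZooKeeper (not the real client API), all method names are made up for the example, and the polling approach is one possible way to tolerate the missing root znode, not necessarily what the attached patches do.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class BootOrderSketch {
    /** Hypothetical in-memory stand-in for ZooKeeper; NOT the real client API. */
    static final class FakeZk {
        private final Set<String> znodes = ConcurrentHashMap.newKeySet();
        void create(String path) { znodes.add(path); }
        boolean exists(String path) { return znodes.contains(path); }
    }

    /** Master startup as described: create the root znode, then /hbase/master. */
    static void startMaster(FakeZk zk) {
        zk.create("/hbase");
        zk.create("/hbase/master");
    }

    /** RS startup as described in the report: give up at once if /hbase is missing. */
    static boolean startRegionServerStrict(FakeZk zk) {
        return zk.exists("/hbase");
    }

    /** Poll until the znode appears or the timeout expires (assumed approach). */
    static boolean waitForZnode(FakeZk zk, String path, long timeoutMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (!zk.exists(path)) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out, znode never showed up
            }
            try {
                Thread.sleep(10);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    /** RS startup that waits for the root znode instead of aborting. */
    static boolean startRegionServerPatient(FakeZk zk, long timeoutMs) {
        return waitForZnode(zk, "/hbase", timeoutMs)
            && waitForZnode(zk, "/hbase/master", timeoutMs);
    }

    public static void main(String[] args) {
        FakeZk zk = new FakeZk();
        // RS started before the master on a fresh cluster: the strict check fails.
        System.out.println("strict RS started: " + startRegionServerStrict(zk));
        // Start the master a little later on another thread; the patient RS waits for it.
        new Thread(() -> startMaster(zk)).start();
        System.out.println("patient RS started: " + startRegionServerPatient(zk, 5000));
    }
}
```

With the strict check, the outcome depends on which process reaches ZooKeeper first; with the waiting variant, start order only matters if the master never shows up within the timeout.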
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260171#comment-13260171 ]

Hudson commented on HBASE-5849:
---

Integrated in HBase-TRUNK #2803 (See [https://builds.apache.org/job/HBase-TRUNK/2803/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available (Revision 1329527)

Result = SUCCESS
stack :
Files :
* /hbase/trunk/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/trunk/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260173#comment-13260173 ]

Hudson commented on HBASE-5849:
---

Integrated in HBase-0.94 #139 (See [https://builds.apache.org/job/HBase-0.94/139/])
HBASE-5849 On first cluster startup, RS aborts if root znode is not available (Revision 1329528)

Result = FAILURE
stack :
Files :
* /hbase/branches/0.94/src/main/java/org/apache/hadoop/hbase/regionserver/HRegionServer.java
* /hbase/branches/0.94/src/test/java/org/apache/hadoop/hbase/TestClusterBootOrder.java
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260208#comment-13260208 ]

stack commented on HBASE-5849:
--

I tried it before committing and it passed then. I just tried it on trunk now:

{code}
---
 T E S T S
---
Running org.apache.hadoop.hbase.TestClusterBootOrder
2012-04-23 21:27:45.213 java[97823:d007] Unable to load realm info from SCDynamicStore
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 19.727 sec

Results :

Tests run: 2, Failures: 0, Errors: 0, Skipped: 0

[INFO]
[INFO] --- maven-surefire-plugin:2.10:test (secondPartTestsExecution) @ hbase ---
[INFO] Tests are skipped.
[INFO]
[INFO] BUILD SUCCESS
[INFO]
[INFO] Total time: 34.313s
[INFO] Finished at: Mon Apr 23 21:28:02 PDT 2012
[INFO] Final Memory: 21M/81M
[INFO]
{code}
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260246#comment-13260246 ]

stack commented on HBASE-5849:
--

There is something wrong now. This test won't complete for me (though it has previously). I thought it was the subsequent commit:

{code}
r1329555 | larsh | 2012-04-23 22:12:45 -0700 (Mon, 23 Apr 2012) | 1 line

Refuse operations from Admin before master is initialized - fix for all branches
{code}

...that was bringing on the problem, but with that removed it's still not completing. I poked around in the debugger and was getting an NPE in reportForDuty after the master came up because this.hbaseMaster was null; we were failing to allocate the Interface (hard to trace because toString would throw its own exception).

For now, backing this out.
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260247#comment-13260247 ]

stack commented on HBASE-5849:
--

I mean, it even passed hadoopqa above, apart from my testing. Backing it out though... it's an ugly hang when it happens.
[jira] [Commented] (HBASE-5849) On first cluster startup, RS aborts if root znode is not available
[ https://issues.apache.org/jira/browse/HBASE-5849?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13260248#comment-13260248 ]

stack commented on HBASE-5849:
--

So, yes, I'm seeing what Ted reports above.