[ https://issues.apache.org/jira/browse/HBASE-4857?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ted Yu updated HBASE-4857:
--------------------------

    Status: Patch Available  (was: Open)

> Recursive loop on KeeperException in AuthenticationTokenSecretManager/ZKLeaderManager
> -------------------------------------------------------------------------------------
>
>                 Key: HBASE-4857
>                 URL: https://issues.apache.org/jira/browse/HBASE-4857
>             Project: HBase
>          Issue Type: Bug
>          Components: security
>    Affects Versions: 0.92.0, 0.94.0
>            Reporter: Gary Helmling
>             Fix For: 0.92.0
>
>         Attachments: HBASE-4857.patch
>
>
> Looking through stack traces for {{TestMasterFailover}}, I see a case where
> the leader {{AuthenticationTokenSecretManager}} can get into a recursive loop
> when a {{KeeperException}} is encountered:
> {noformat}
> "Thread-1-EventThread" daemon prio=10 tid=0x00007f9fb47b2800 nid=0x77f6 waiting on condition [0x00007f9fab376000]
>    java.lang.Thread.State: TIMED_WAITING (sleeping)
>         at java.lang.Thread.sleep(Native Method)
>         at java.lang.Thread.sleep(Thread.java:302)
>         at java.util.concurrent.TimeUnit.sleep(TimeUnit.java:328)
>         at org.apache.hadoop.hbase.util.RetryCounter.sleepUntilNextRetry(RetryCounter.java:55)
>         at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.exists(RecoverableZooKeeper.java:206)
>         at org.apache.hadoop.hbase.zookeeper.ZKUtil.createAndFailSilent(ZKUtil.java:891)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.createBaseZNodes(ZooKeeperWatcher.java:161)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.<init>(ZooKeeperWatcher.java:154)
>         at org.apache.hadoop.hbase.master.HMaster.tryRecoveringExpiredZKSession(HMaster.java:1397)
>         at org.apache.hadoop.hbase.master.HMaster.abortNow(HMaster.java:1435)
>         at org.apache.hadoop.hbase.master.HMaster.abort(HMaster.java:1374)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.abort(ZooKeeperWatcher.java:450)
>         at org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.stepDownAsLeader(ZKLeaderManager.java:166)
>         at org.apache.hadoop.hbase.security.token.AuthenticationTokenSecretManager$LeaderElector.stop(AuthenticationTokenSecretManager.java:293)
>         at org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.stepDownAsLeader(ZKLeaderManager.java:167)
>         at org.apache.hadoop.hbase.security.token.AuthenticationTokenSecretManager$LeaderElector.stop(AuthenticationTokenSecretManager.java:293)
>         at org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.stepDownAsLeader(ZKLeaderManager.java:167)
>         at org.apache.hadoop.hbase.security.token.AuthenticationTokenSecretManager$LeaderElector.stop(AuthenticationTokenSecretManager.java:293)
>         at org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.handleLeaderChange(ZKLeaderManager.java:96)
>         at org.apache.hadoop.hbase.zookeeper.ZKLeaderManager.nodeDeleted(ZKLeaderManager.java:78)
>         at org.apache.hadoop.hbase.zookeeper.ZooKeeperWatcher.process(ZooKeeperWatcher.java:286)
>         at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:521)
>         at org.apache.zookeeper.ClientCnxn$EventThread.run(ClientCnxn.java:497)
> {noformat}
> The {{KeeperException}} causes {{ZKLeaderManager}} to call
> {{AuthenticationTokenSecretManager$LeaderElector.stop()}}, which calls
> {{ZKLeaderManager.stepDownAsLeader()}}, which will encounter another
> {{KeeperException}}, and so on...
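
The mutual recursion described in the report can be reproduced in miniature. The sketch below is not the HBase code and not the attached patch: every class and method name in it (RecursiveStopSketch, FakeKeeperException, StoppableSketch, LeaderManagerSketch, ElectorSketch) is a hypothetical stand-in. It assumes only what the report states, namely that a failed step-down calls stop() and that stop() calls the step-down again. One possible way to break such a cycle is to make stop() idempotent with a flag; whether HBASE-4857.patch takes that approach is not stated here.

```java
import java.util.concurrent.atomic.AtomicBoolean;

public class RecursiveStopSketch {

    /** Hypothetical stand-in for a ZooKeeper failure; not the real KeeperException. */
    static class FakeKeeperException extends Exception {
        FakeKeeperException(String msg) { super(msg); }
    }

    /** Hypothetical callback the leader manager invokes when stepping down fails. */
    interface StoppableSketch {
        void stop();
    }

    /** Simplified leader manager whose ZooKeeper call always fails, as in the trace. */
    static class LeaderManagerSketch {
        private final StoppableSketch candidate;
        int stepDownCalls = 0;

        LeaderManagerSketch(StoppableSketch candidate) {
            this.candidate = candidate;
        }

        // Assumption: models a ZooKeeper operation that keeps failing.
        private void deleteLeaderNode() throws FakeKeeperException {
            throw new FakeKeeperException("connection loss");
        }

        void stepDownAsLeader() {
            stepDownCalls++;
            try {
                deleteLeaderNode();
            } catch (FakeKeeperException e) {
                // On failure, ask the candidate to stop. Without a guard in
                // stop(), this re-enters stepDownAsLeader() indefinitely.
                candidate.stop();
            }
        }
    }

    /** Elector whose stop() steps down as leader, guarded against re-entry. */
    static class ElectorSketch implements StoppableSketch {
        private final AtomicBoolean stopped = new AtomicBoolean(false);
        final LeaderManagerSketch manager = new LeaderManagerSketch(this);

        @Override
        public void stop() {
            // Only the first caller proceeds; a re-entrant call returns at once.
            if (!stopped.compareAndSet(false, true)) {
                return;
            }
            manager.stepDownAsLeader();
        }
    }

    public static void main(String[] args) {
        ElectorSketch elector = new ElectorSketch();
        elector.stop();
        System.out.println("stepDownAsLeader calls: " + elector.manager.stepDownCalls);
        // prints "stepDownAsLeader calls: 1"
    }
}
```

Removing the compareAndSet check in this sketch makes stop() and stepDownAsLeader() call each other until the stack overflows, which mirrors the alternating frames in the trace above.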