[ https://issues.apache.org/jira/browse/HBASE-6046?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13287467#comment-13287467 ]
Ashutosh Jindal commented on HBASE-6046:
----------------------------------------
Please check the second test case added,
testLogSplittingAfterMasterRecoveryDueToZKExpiry(). If the test case is run
without the patch, a StackOverflowError is thrown:
{code}
java.lang.StackOverflowError
at java.lang.System.getProperty(System.java:647)
at sun.security.action.GetPropertyAction.run(GetPropertyAction.java:67)
at sun.security.action.GetPropertyAction.run(GetPropertyAction.java:32)
at java.security.AccessController.doPrivileged(Native Method)
at java.io.PrintWriter.<init>(PrintWriter.java:78)
at java.io.PrintWriter.<init>(PrintWriter.java:62)
at org.apache.log4j.DefaultThrowableRenderer.render(DefaultThrowableRenderer.java:58)
at org.apache.log4j.spi.ThrowableInformation.getThrowableStrRep(ThrowableInformation.java:87)
at org.apache.log4j.spi.LoggingEvent.getThrowableStrRep(LoggingEvent.java:413)
at org.apache.log4j.WriterAppender.subAppend(WriterAppender.java:313)
at org.apache.log4j.WriterAppender.append(WriterAppender.java:162)
at org.apache.log4j.AppenderSkeleton.doAppend(AppenderSkeleton.java:251)
at org.apache.log4j.helpers.AppenderAttachableImpl.appendLoopOnAppenders(AppenderAttachableImpl.java:66)
at org.apache.log4j.Category.callAppenders(Category.java:206)
at org.apache.log4j.Category.forcedLog(Category.java:391)
at org.apache.log4j.Category.log(Category.java:856)
at org.slf4j.impl.Log4jLoggerAdapter.error(Log4jLoggerAdapter.java:485)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:623)
at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:640)
at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:658)
at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1274)
at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:975)
at org.apache.hadoop.hbase.master.SplitLogManager.deleteNode(SplitLogManager.java:626)
at org.apache.hadoop.hbase.master.SplitLogManager.access$17(SplitLogManager.java:620)
at org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback.processResult(SplitLogManager.java:1104)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619)
at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
at org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:640)
at org.apache.zookeeper.ClientCnxn.conLossPacket(ClientCnxn.java:658)
at org.apache.zookeeper.ClientCnxn.queuePacket(ClientCnxn.java:1274)
at org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:975)
at org.apache.hadoop.hbase.master.SplitLogManager.deleteNode(SplitLogManager.java:626)
at org.apache.hadoop.hbase.master.SplitLogManager.access$17(SplitLogManager.java:620)
at org.apache.hadoop.hbase.master.SplitLogManager$DeleteAsyncCallback.processResult(SplitLogManager.java:1104)
at org.apache.zookeeper.ClientCnxn$EventThread.processEvent(ClientCnxn.java:619)
at org.apache.zookeeper.ClientCnxn$EventThread.queuePacket(ClientCnxn.java:477)
{code}
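This happens because the listener for SplitLogManager is not re-registered
after the master recovers from an expired ZK session. In the trace,
DeleteAsyncCallback.processResult() calls deleteNode() again, the delete fails
on the connection-loss path, and the callback is invoked once more on the same
stack until it overflows.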
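
The repeating frames in the trace can be reproduced in isolation. The sketch
below is only an illustration, not HBase or ZooKeeper code: DeadSessionClient,
RetryingCallback and the CONNECTION_LOSS constant are hypothetical stand-ins.
It assumes a client that, once its session is dead, fails every request inline
on the caller's thread, and a callback that retries the delete on any error.
The retry limit exists only so the nesting can be measured; it is not the fix
in the patch.

```java
public class InlineCallbackDemo {
    // Hypothetical error code standing in for a connection-loss result.
    static final int CONNECTION_LOSS = -4;

    interface DeleteCallback {
        void processResult(int rc, String path);
    }

    /** Hypothetical client with a dead session: every request fails at once
     *  and the callback runs inline, on the caller's stack. */
    static class DeadSessionClient {
        void delete(String path, DeleteCallback cb) {
            cb.processResult(CONNECTION_LOSS, path);
        }
    }

    /** Callback that retries the delete whenever it sees an error. */
    static class RetryingCallback implements DeleteCallback {
        final DeadSessionClient client;
        int retriesLeft;
        int depth;     // callback frames currently on the stack
        int maxDepth;  // deepest nesting observed so far

        RetryingCallback(DeadSessionClient client, int retries) {
            this.client = client;
            this.retriesLeft = retries;
        }

        @Override
        public void processResult(int rc, String path) {
            depth++;
            maxDepth = Math.max(maxDepth, depth);
            if (rc != 0 && retriesLeft-- > 0) {
                client.delete(path, this); // retry nests one frame deeper
            }
            depth--;
        }
    }

    /** Returns how deeply the callback nested when allowed `retries` retries. */
    static int maxNesting(int retries) {
        DeadSessionClient client = new DeadSessionClient();
        RetryingCallback cb = new RetryingCallback(client, retries);
        client.delete("/hypothetical/splitlog/task", cb);
        return cb.maxDepth;
    }

    /** Returns true if retrying with an effectively unlimited budget
     *  exhausts the thread's stack. */
    static boolean overflowsWithoutLimit() {
        DeadSessionClient client = new DeadSessionClient();
        RetryingCallback cb = new RetryingCallback(client, Integer.MAX_VALUE);
        try {
            client.delete("/hypothetical/splitlog/task", cb);
            return false;
        } catch (StackOverflowError e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("nesting with 3 retries: " + maxNesting(3));
        System.out.println("overflows without a limit: " + overflowsWithoutLimit());
    }
}
```

Each retry adds one callback frame, so a retry loop that never sees a
successful result only ends when the stack is exhausted.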
> Master retry on ZK session expiry causes inconsistent region assignments.
> -------------------------------------------------------------------------
>
> Key: HBASE-6046
> URL: https://issues.apache.org/jira/browse/HBASE-6046
> Project: HBase
> Issue Type: Bug
> Components: master
> Affects Versions: 0.92.1, 0.94.0
> Reporter: Gopinathan A
> Assignee: ramkrishna.s.vasudevan
> Attachments: HBASE_6046_0.94.patch
>
>
> 1> A ZK session timeout in the HMaster leads to bulk assignment even though
> all the RSs are online.
> 2> During bulk assignment, if the master goes down again and restarts (or a
> backup comes up), the new master tries to reassign all the nodes created in
> ZK to the new RSs. This leads to double assignment.
> We had 2800 regions; of these, 1900 regions were double-assigned, taking the
> region count to 4700.