[
https://issues.apache.org/jira/browse/HBASE-5572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228872#comment-13228872
]
Hadoop QA commented on HBASE-5572:
----------------------------------
-1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12518238/5572.v2.patch
against trunk revision .
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 5 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac
compiler warnings.
-1 findbugs. The patch appears to introduce 161 new Findbugs (version
1.3.9) warnings.
+1 release audit. The applied patch does not increase the total number of
release audit warnings.
+1 core tests. The patch passed unit tests in .
Test results:
https://builds.apache.org/job/PreCommit-HBASE-Build/1179//testReport/
Findbugs warnings:
https://builds.apache.org/job/PreCommit-HBASE-Build/1179//artifact/trunk/patchprocess/newPatchFindbugsWarnings.html
Console output:
https://builds.apache.org/job/PreCommit-HBASE-Build/1179//console
This message is automatically generated.
> KeeperException.SessionExpiredException management could be improved in Master
> ------------------------------------------------------------------------------
>
> Key: HBASE-5572
> URL: https://issues.apache.org/jira/browse/HBASE-5572
> Project: HBase
> Issue Type: Improvement
> Components: master
> Affects Versions: 0.96.0
> Reporter: nkeywal
> Assignee: nkeywal
> Priority: Minor
> Fix For: 0.96.0
>
> Attachments: 5572.v1.patch, 5572.v2.patch, 5572.v2.patch,
> 5572.v2.patch
>
>
> Synthesis:
> 1) TestMasterZKSessionRecovery distinguish two cases on
> SessionExpiredException. One is explicitly not managed. However, is seems
> that there is no reason for this.
> 2) The issue lies in ActiveMasterManager#blockUntilBecomingActiveMaster, a
> quite complex function, with a useless recursive call.
> 3) TestMasterZKSessionRecovery#testMasterZKSessionRecoverySuccess is
> equivalent to TestZooKeeper#testMasterSessionExpired
> 4) TestMasterZKSessionRecovery#testMasterZKSessionRecoveryFailure can be
> removed if we merge the two cases mentioned above.
> Changes are:
> 2) Changing ActiveMasterManager#blockUntilBecomingActiveMaster to have a
> single case and remove recursion
> 1) Removing TestMasterZKSessionRecovery
> Detailed justification:
> testMasterZKSessionRecoveryFailure says:
> {noformat}
> /**
> * Negative test of master recovery from zk session expiry.
> *
> * Starts with one master. Fakes the master zk session expired.
> * Ensures the master cannot recover the expired zk session since
> * the master zk node is still there.
> */
> public void testMasterZKSessionRecoveryFailure() throws Exception {
> MiniHBaseCluster cluster = TEST_UTIL.getHBaseCluster();
> HMaster m = cluster.getMaster();
> m.abort("Test recovery from zk session expired",
> new KeeperException.SessionExpiredException());
> assertTrue(m.isStopped());
> }
> {noformat}
> This tests works, i.e. the assertion is always verified.
> But do we really want this behavior?
> When looking at the code, we see that this what's happening is strange:
> - HMaster#abort calls Master#abortNow. If HMaster#abortNow returns false
> HMaster#abort stops the master.
> - HMaster#abortNow checks the exception type. As it's a
> SessionExpiredException it will try to recover, calling
> HMaster#tryRecoveringExpiredZKSession. If it cannot, it will return false
> (and that will make HMaster#abort stopping the master)
> - HMaster#tryRecoveringExpiredZKSession recreates a ZooKeeperConnection and
> then try to become the active master. If it cannot, it will return false (and
> that will make HMaster#abort stopping the master).
> - HMaster#becomeActiveMaster returns the result of
> ActiveMasterManager#blockUntilBecomingActiveMaster.
> blockUntilBecomingActiveMaster says it will return false if there is any
> error preventing it to become the active master.
> - ActiveMasterManager#blockUntilBecomingActiveMaster reads ZK for the master
> address. If it's the same port & host, it deletes the nodes, that will start
> a recursive call to blockUntilBecomingActiveMaster. This second call succeeds
> (we became the active master) and return true. This result is ignored by the
> first blockUntilBecomingActiveMaster: it return false (even if we actually
> became the active master), hence the whole suite call returns false and
> HMaster#abort stops the master.
> In other words, the comment says "Ensures the master cannot recover the
> expired zk session since the master zk node is still there." but we're
> actually doing a check just for this and deleting the node. If we were not
> ignoring the result, we would return "true", so we would not stop the master,
> so the test would fail.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira