[ https://issues.apache.org/jira/browse/ZOOKEEPER-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950432#comment-13950432 ]
Michi Mutsuzaki commented on ZOOKEEPER-1870:
--------------------------------------------
I can't reproduce it now, but I think there was a case where the quorum peer
incorrectly became a leader after shutdown() was called if proposedLeader
wasn't set to -1. I'm guessing it could happen if shutdown() gets called right
before this block of code gets executed. Maybe there is a way to shut down the
leader election more cleanly?
{noformat}
/*
 * This predicate is true once we don't read any new
 * relevant message from the reception queue
 */
if (n == null) {
    self.setPeerState((proposedLeader == self.getId()) ?
            ServerState.LEADING : learningState());
    Vote endVote = new Vote(proposedLeader,
            proposedZxid, proposedEpoch);
    leaveInstance(endVote);
    return endVote;
}
{noformat}
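To make the suspected race concrete, here is a minimal, self-contained sketch of one way to close it. This is not the ZooKeeper code: the Peer class, its shuttingDown flag, and finishElection() are hypothetical names used only for illustration. The idea is that the "election finished" transition and shutdown() take the same lock, so a peer that has already been shut down can never move to LEADING.

```java
public class ElectionShutdownSketch {
    enum ServerState { LOOKING, FOLLOWING, LEADING }

    // Hypothetical stand-in for a quorum peer; not the real QuorumPeer.
    static final class Peer {
        private final long id;
        private boolean shuttingDown = false;
        private ServerState state = ServerState.LOOKING;

        Peer(long id) { this.id = id; }

        // Marks the peer as shutting down. Holds the same lock as
        // finishElection(), so the two cannot interleave.
        synchronized void shutdown() {
            shuttingDown = true;
            state = ServerState.LOOKING;
        }

        // Called when the election loop has no more messages to read.
        // Returns the elected leader id, or -1 if the election was
        // abandoned because shutdown() had already been called.
        synchronized long finishElection(long proposedLeader) {
            if (shuttingDown) {
                return -1; // do not change state after shutdown
            }
            state = (proposedLeader == id)
                    ? ServerState.LEADING : ServerState.FOLLOWING;
            return proposedLeader;
        }

        synchronized ServerState state() { return state; }
    }

    public static void main(String[] args) {
        Peer halted = new Peer(1);
        halted.shutdown();
        // Election completes after shutdown: the peer must not lead.
        System.out.println(halted.finishElection(1) + " " + halted.state());

        Peer live = new Peer(1);
        System.out.println(live.finishElection(1) + " " + live.state());
    }
}
```

Whether a lock, a flag check, or resetting proposedLeader to -1 is the right fix for the real election code is a separate question; the sketch only shows that the check and the state change need to be atomic with respect to shutdown().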
Yes, I think we should fix this in 3.4. I'll upload a separate patch for 3.4.
> flakey test in StandaloneDisabledTest.startSingleServerTest
> -----------------------------------------------------------
>
> Key: ZOOKEEPER-1870
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1870
> Project: ZooKeeper
> Issue Type: Bug
> Components: tests
> Affects Versions: 3.5.0
> Reporter: Patrick Hunt
> Assignee: Helen Hastings
> Priority: Blocker
> Attachments: ZOOKEEPER-1870.patch, ZOOKEEPER-1870.patch, test.log
>
>
> I'm seeing lots of the following failure. Seems like a flakey test (passes
> every so often).
> {noformat}
> junit.framework.AssertionFailedError: client could not connect to reestablished quorum: giving up after 30+ seconds.
>     at org.apache.zookeeper.test.ReconfigTest.testNormalOperation(ReconfigTest.java:143)
>     at org.apache.zookeeper.server.quorum.StandaloneDisabledTest.startSingleServerTest(StandaloneDisabledTest.java:75)
>     at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
> {noformat}
> I've found 3 problems:
> 1. QuorumCnxManager.Listener.run() leaks the socket depending on when the
> shutdown flag gets set.
> 2. QuorumCnxManager.halt() doesn't wait for the listener to terminate.
> 3. QuorumPeer.shuttingDownLE flag doesn't get reset when restarting the
> leader election.
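
For problems 1 and 2 in the description quoted above, the general pattern can be sketched as follows. This is an illustration under stated assumptions, not the actual QuorumCnxManager code: the Listener class here is hypothetical, it binds its socket in the constructor instead of inside run(), and it listens on an ephemeral loopback port. The socket is closed on every exit path of run(), and halt() joins the thread so the caller knows the port is free before restarting.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;
import java.net.Socket;

public class ListenerSketch {
    // Hypothetical listener thread; not the real QuorumCnxManager.Listener.
    static final class Listener extends Thread {
        private final ServerSocket ss;
        private volatile boolean shutdown = false;

        Listener() throws IOException {
            ss = new ServerSocket();
            ss.setReuseAddress(true);
            // Port 0 asks the OS for any free port (assumption for the demo).
            ss.bind(new InetSocketAddress("127.0.0.1", 0));
        }

        @Override
        public void run() {
            try {
                while (!shutdown) {
                    try (Socket client = ss.accept()) {
                        // A real listener would hand the connection off here.
                    }
                }
            } catch (IOException e) {
                // accept() throws once halt() has closed the socket.
            } finally {
                closeQuietly(); // no exit path leaves the socket open
            }
        }

        // Sets the flag, unblocks accept() by closing the socket,
        // then waits for the thread to terminate.
        void halt() throws InterruptedException {
            shutdown = true;
            closeQuietly();
            join();
        }

        boolean isSocketClosed() { return ss.isClosed(); }

        private void closeQuietly() {
            try {
                ss.close();
            } catch (IOException ignored) {
                // nothing useful to do during shutdown
            }
        }
    }

    public static void main(String[] args) throws Exception {
        Listener listener = new Listener();
        listener.start();
        listener.halt();
        System.out.println("alive=" + listener.isAlive()
                + " closed=" + listener.isSocketClosed());
    }
}
```

Problem 3 (resetting the shuttingDownLE flag on restart) is not covered by this sketch.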