[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13950432#comment-13950432
 ] 

Michi Mutsuzaki commented on ZOOKEEPER-1870:
--------------------------------------------

I can't reproduce it now, but I think there was a case where the quorum peer 
incorrectly became a leader after shutdown() was called if proposedLeader 
wasn't set to -1. I'm guessing it could happen if shutdown() gets called right 
before this block of code is executed. Maybe there is a way to shut down the 
leader election more cleanly?

{noformat}
/*
 * This predicate is true once we don't read any new
 * relevant message from the reception queue
 */
if (n == null) {
    self.setPeerState((proposedLeader == self.getId()) ?
            ServerState.LEADING: learningState());

    Vote endVote = new Vote(proposedLeader,
            proposedZxid, proposedEpoch);
    leaveInstance(endVote);
    return endVote;
}
{noformat}
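
One possible way to close the race described above is to make shutdown and the 
end-of-election step mutually exclusive, and to have the latter bail out once 
shutdown has been requested. The following is only a minimal sketch under that 
assumption: ElectionSketch, finishElection() and the simplified ServerState 
enum are hypothetical stand-ins, not the actual FastLeaderElection or 
QuorumPeer code, and FOLLOWING stands in for whatever learningState() would 
return.

```java
/** Hypothetical, simplified stand-in for the election loop; not ZooKeeper's real classes. */
class ElectionSketch {
    enum ServerState { LOOKING, FOLLOWING, LEADING }

    private final long myId;
    private boolean shutdown = false;
    private long proposedLeader;
    private ServerState state = ServerState.LOOKING;

    ElectionSketch(long myId) {
        this.myId = myId;
        this.proposedLeader = myId; // start by proposing ourselves
    }

    /** Marks the election as stopped and clears the proposal. */
    synchronized void shutdown() {
        shutdown = true;
        proposedLeader = -1;
    }

    /**
     * Mirrors the "n == null" branch: decide the final state.
     * Returns the elected leader id, or -1 if shutdown was requested first.
     * Holding the same lock as shutdown() means the check and the state
     * change cannot be interleaved with a concurrent shutdown() call.
     */
    synchronized long finishElection() {
        if (shutdown) {
            return -1; // do not transition to LEADING after shutdown
        }
        state = (proposedLeader == myId) ? ServerState.LEADING : ServerState.FOLLOWING;
        return proposedLeader;
    }

    synchronized ServerState getState() {
        return state;
    }
}
```

With this shape, a shutdown() that arrives just before the final step leaves 
the peer in LOOKING instead of letting it become leader. Whether a lock is 
acceptable on this path in the real code is a separate question.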

Yes, I think we should fix this in 3.4. I'll upload a separate patch for 3.4.

> flakey test in StandaloneDisabledTest.startSingleServerTest
> -----------------------------------------------------------
>
>                 Key: ZOOKEEPER-1870
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1870
>             Project: ZooKeeper
>          Issue Type: Bug
>          Components: tests
>    Affects Versions: 3.5.0
>            Reporter: Patrick Hunt
>            Assignee: Helen Hastings
>            Priority: Blocker
>         Attachments: ZOOKEEPER-1870.patch, ZOOKEEPER-1870.patch, test.log
>
>
> I'm seeing lots of the following failure. Seems like a flakey test (passes 
> every so often).
> {noformat}
> junit.framework.AssertionFailedError: client could not connect to reestablished quorum: giving up after 30+ seconds.
>       at org.apache.zookeeper.test.ReconfigTest.testNormalOperation(ReconfigTest.java:143)
>       at org.apache.zookeeper.server.quorum.StandaloneDisabledTest.startSingleServerTest(StandaloneDisabledTest.java:75)
>       at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
> {noformat}
> I've found 3 problems:
> 1. QuorumCnxManager.Listener.run() leaks the socket depending on when the 
> shutdown flag gets set.
> 2. QuorumCnxManager.halt() doesn't wait for the listener to terminate.
> 3. QuorumPeer.shuttingDownLE flag doesn't get reset when restarting the 
> leader election.
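
Problems 1 and 2 in the description above could be addressed by a listener 
that always closes its socket on the way out and a halt() that waits for the 
thread to finish. The sketch below only illustrates that pattern with plain 
java.net classes. ListenerSketch and its methods are hypothetical names, not 
the QuorumCnxManager code or the attached patch.

```java
import java.io.IOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

/** Hypothetical listener thread; illustrates the pattern only. */
class ListenerSketch extends Thread {
    private final ServerSocket serverSocket;
    private volatile boolean shutdown = false;

    ListenerSketch() throws IOException {
        // Port 0 lets the OS pick a free port; bind to loopback only.
        serverSocket = new ServerSocket(0, 50, InetAddress.getLoopbackAddress());
    }

    @Override
    public void run() {
        try {
            while (!shutdown) {
                try (Socket client = serverSocket.accept()) {
                    // A real listener would hand the connection off here.
                }
            }
        } catch (IOException e) {
            // accept() throws once halt() closes the socket; expected on shutdown.
        } finally {
            closeQuietly(); // the socket is released no matter when the flag was set
        }
    }

    /** Requests shutdown, unblocks accept(), and waits for the thread to exit. */
    void halt() throws InterruptedException {
        shutdown = true;
        closeQuietly();
        join();
    }

    boolean isSocketClosed() {
        return serverSocket.isClosed();
    }

    private void closeQuietly() {
        try {
            serverSocket.close(); // closing an already closed socket is a no-op
        } catch (IOException ignored) {
            // nothing useful to do during shutdown
        }
    }
}
```

Because the close happens in a finally block and halt() joins the thread, the 
caller can rely on the port being free once halt() returns, regardless of 
where the listener was when the flag was set.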



