[
https://issues.apache.org/jira/browse/ZOOKEEPER-1870?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13936282#comment-13936282
]
Michi Mutsuzaki commented on ZOOKEEPER-1870:
--------------------------------------------
Ok I think I know what the problem is. There is a race between
QuorumCnxManager.Listener.run() and QuorumCnxManager.Listener.halt() that
causes the socket to leak.
1. {{QuorumCnxManager.Listener.run()}} goes into the while loop
{{while((!shutdown) && (numRetries < 3))}}
2. {{QuorumCnxManager.halt()}} gets called, sets {{shutdown}} to {{true}} and
calls {{QuorumCnxManager.Listener.halt()}}.
3. {{QuorumCnxManager.Listener.halt()}} closes the socket.
4. {{QuorumCnxManager.Listener.run()}} binds the socket and breaks out of the
while loop since the shutdown flag is set, so the bound socket is never closed.
I'll upload a patch.
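
For illustration only, here is a minimal sketch of the interleaving in steps 1-4. It is not the ZooKeeper source and not the patch: {{ListenerSketch}}, {{afterLoopCheck}}, {{closeOnExit}} and {{socketLeaked()}} are hypothetical names, and closing the socket when {{run()}} exits is just one possible remedy, assumed here for the sake of the example.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Hypothetical stand-in for QuorumCnxManager.Listener; not the real ZooKeeper code.
class ListenerSketch {
    private volatile boolean shutdown = false;
    private volatile ServerSocket ss;
    private final boolean closeOnExit;

    // Hypothetical hook that runs after the loop check and before the socket
    // is created, so the racy interleaving can be forced deterministically.
    Runnable afterLoopCheck = () -> {};

    ListenerSketch(boolean closeOnExit) {
        this.closeOnExit = closeOnExit;
    }

    void run() {
        int numRetries = 0;
        while (!shutdown && numRetries < 3) {       // step 1: shutdown is still false
            afterLoopCheck.run();                   // steps 2 and 3: halt() may run here
            try {
                ss = new ServerSocket();
                // Port 0 asks the OS for any free port (an assumption of this sketch).
                ss.bind(new InetSocketAddress("127.0.0.1", 0)); // step 4: bind
                while (!shutdown) {
                    ss.accept().close();            // placeholder for connection handling
                }
            } catch (IOException e) {
                numRetries++;
            }
        }
        // Assumed remedy: whoever leaves the loop closes whatever is still open.
        if (closeOnExit) {
            closeQuietly(ss);
        }
    }

    void halt() {
        shutdown = true;
        closeQuietly(ss);                           // ss may be null or stale at this point
    }

    // True if a socket was bound and nobody closed it.
    boolean socketLeaked() {
        ServerSocket s = ss;
        return s != null && s.isBound() && !s.isClosed();
    }

    private static void closeQuietly(ServerSocket s) {
        if (s == null) {
            return;
        }
        try {
            s.close();
        } catch (IOException ignored) {
            // Nothing useful to do in a sketch.
        }
    }
}
```

Setting {{afterLoopCheck}} to call {{halt()}} reproduces the ordering above: {{halt()}} finds nothing to close, {{run()}} then binds a fresh socket and leaves both loops. With {{closeOnExit}} set to false the socket stays open; with it set to true the socket is closed on the way out.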
> flakey test in StandaloneDisabledTest.startSingleServerTest
> -----------------------------------------------------------
>
> Key: ZOOKEEPER-1870
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1870
> Project: ZooKeeper
> Issue Type: Bug
> Components: tests
> Affects Versions: 3.5.0
> Reporter: Patrick Hunt
> Assignee: Helen Hastings
> Priority: Critical
>
> I'm seeing lots of the following failure. Seems like a flakey test (passes
> every so often).
> {noformat}
> junit.framework.AssertionFailedError: client could not connect to reestablished quorum: giving up after 30+ seconds.
> at org.apache.zookeeper.test.ReconfigTest.testNormalOperation(ReconfigTest.java:143)
> at org.apache.zookeeper.server.quorum.StandaloneDisabledTest.startSingleServerTest(StandaloneDisabledTest.java:75)
> at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
> {noformat}