[
https://issues.apache.org/jira/browse/ZOOKEEPER-1733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13717746#comment-13717746
]
Jeffrey Zhong commented on ZOOKEEPER-1733:
------------------------------------------
[~enis] found the cause: the Windows run of testTripleElection took more
than 10 seconds. After bumping waitCounter up to 200, FLETest passes consistently.
I'll try to create a patch to port ZOOKEEPER-1292 to 3.4. Meanwhile, I'll check
why the Windows run takes longer than the Linux run and may open another JIRA. Thanks.
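
For readers unfamiliar with this kind of failure, here is a minimal sketch of a bounded polling wait, assuming waitCounter counts the iterations of a wait loop. The class name, method name, parameters, and the per-iteration sleep are hypothetical and are not taken from FLETest. The point is only that a slower platform can exhaust a small iteration budget before the awaited condition becomes true, while a larger budget lets the same run succeed.

```java
import java.util.function.BooleanSupplier;

public class BoundedWaitSketch {
    /**
     * Polls `done` until it returns true or the iteration budget runs out.
     * maxIterations plays the role of the waitCounter limit (an assumption,
     * not the actual FLETest code); stepMillis is a hypothetical sleep per step.
     */
    static boolean waitFor(BooleanSupplier done, int maxIterations, long stepMillis)
            throws InterruptedException {
        int waitCounter = 0;
        while (!done.getAsBoolean()) {
            if (waitCounter >= maxIterations) {
                return false; // budget exhausted: the caller would report a failure
            }
            Thread.sleep(stepMillis);
            waitCounter++;
        }
        return true;
    }

    public static void main(String[] args) throws InterruptedException {
        long start = System.currentTimeMillis();
        // Hypothetical slow operation that needs about 50 ms to complete.
        BooleanSupplier slow = () -> System.currentTimeMillis() - start >= 50;
        System.out.println("small budget: " + waitFor(slow, 2, 10));
        System.out.println("large budget: " + waitFor(slow, 200, 10));
    }
}
```

Raising the limit only widens the time window; it does not explain why the run is slower, which is why a separate investigation is still worthwhile.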
> FLETest#testLE is flaky on windows boxes
> ----------------------------------------
>
> Key: ZOOKEEPER-1733
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1733
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.4.5
> Reporter: Jeffrey Zhong
>
> FLETest#testLE fails intermittently on Windows boxes. The reason is that in
> LEThread#run() we have:
> {code}
> if(leader == i){
>     synchronized(finalObj){
>         successCount++;
>         if(successCount > (count/2))
>             finalObj.notify();
>     }
>     break;
> }
> {code}
> Basically, once we have a confirmed leader, the leader thread dies because of the
> "break" out of the while loop.
> In the verification step, however, we check whether the leader thread is alive, as
> follows:
> {code}
> if(threads.get((int) leader).isAlive()){
>     Assert.fail("Leader hasn't joined: " + leader);
> }
> {code}
> On Windows boxes, the above verification step fails frequently because the leader
> thread has most likely already exited.
> Do we know why we have the leader-alive verification step when only the leader
> thread can bump successCount up to >= count/2?
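
The thread lifecycle described in the issue can be reproduced in isolation. The following is a minimal sketch, not the FLETest code: the class name, the single-peer setup (count = 1), and the int[] holder are assumptions made for illustration. It shows that once run() leaves its loop through "break", the thread terminates and isAlive() returns false after a join. Without the join, the value of isAlive() depends on timing, and the quoted check calls Assert.fail() when it returns true.

```java
public class LeaderExitSketch {
    /**
     * Starts one hypothetical "leader" thread that reports success and then
     * breaks out of its loop, mirroring the shape of the snippet quoted above.
     * Returns the leader's isAlive() value after the verifier has joined it.
     */
    static boolean leaderAliveAfterJoin() throws InterruptedException {
        final Object finalObj = new Object();
        final int[] successCount = {0};
        final int count = 1; // assumed: a single peer, so one success is a majority

        Thread leader = new Thread(() -> {
            while (true) {
                synchronized (finalObj) {
                    successCount[0]++;
                    if (successCount[0] > (count / 2)) {
                        finalObj.notify();
                    }
                }
                break; // leaving the loop ends run(), so the thread terminates
            }
        });

        synchronized (finalObj) {
            // Start while holding the lock so the notification cannot be missed.
            leader.start();
            while (successCount[0] <= count / 2) {
                finalObj.wait();
            }
        }

        // At this point the leader may or may not have finished exiting;
        // joining removes that timing dependence before isAlive() is sampled.
        leader.join();
        return leader.isAlive();
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("leader alive after join: " + leaderAliveAfterJoin());
    }
}
```

Under these assumptions the program prints "leader alive after join: false", because Thread.join() returns only once the thread has terminated.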