[
https://issues.apache.org/jira/browse/ZOOKEEPER-1733?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13715883#comment-13715883
]
Jeffrey Zhong commented on ZOOKEEPER-1733:
------------------------------------------
{bq}
I haven't had a chance to run on a windows box yet, but the message you have
above says that it didn't get a majority, so it's not that the leader hasn't
joined
{bq}
In trunk, it failed before the test can reach "Leader hasn't joined"
verification because assert failed in step "Fewer than a a majority has joined".
The reason failed on windows is that windows quit a thread as soon as the
thread function exits while it seems doesn't happen on a linux box.
> FLETest#testLE is flaky on windows boxes
> ----------------------------------------
>
> Key: ZOOKEEPER-1733
> URL: https://issues.apache.org/jira/browse/ZOOKEEPER-1733
> Project: ZooKeeper
> Issue Type: Bug
> Affects Versions: 3.4.5
> Reporter: Jeffrey Zhong
>
> FLETest#testLE fail intermittently on windows boxes. The reason is that in
> LEThread#run() we have:
> {code}
> if(leader == i){
> synchronized(finalObj){
> successCount++;
> if(successCount > (count/2))
> finalObj.notify();
> }
> break;
> }
> {code}
> Basically once we have a confirmed leader, the leader thread dies due to the
> "break" of while loop.
> While in the verification step, we check if the leader thread alive or not as
> following:
> {code}
> if(threads.get((int) leader).isAlive()){
> Assert.fail("Leader hasn't joined: " + leader);
> }
> {code}
> On windows boxes, the above verification step fails frequently because leader
> thread most likely already exits.
> Do we know why we have the leader alive verification step only lead thread
> can bump up successCount >= count/2?
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira