[ https://issues.apache.org/jira/browse/ZOOKEEPER-362?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Patrick Hunt reassigned ZOOKEEPER-362:
--------------------------------------

    Assignee: Flavio Paiva Junqueira

> Issues with FLENewEpochTest
> ---------------------------
>
>                 Key: ZOOKEEPER-362
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-362
>             Project: Zookeeper
>          Issue Type: Bug
>    Affects Versions: 3.1.1
>            Reporter: Flavio Paiva Junqueira
>            Assignee: Flavio Paiva Junqueira
>             Fix For: 3.2.0
>
>         Attachments: ZOOKEEPER-362.patch, ZOOKEEPER-362.patch
>
>
> I have been able to identify two reasons that cause FLENewEpochTest to fail:
> 1- There is a race condition that is triggered when two peers try to
> establish a connection to each other for leader election. Basically, if they
> start at roughly the same time, the server with the highest id will try to
> open two connections. The two competing connections cause one notification
> message to be lost. This message happens to be critical in this two-process
> scenario;
> 2- The code to shut down a peer does not work well with the unit tests. For
> this particular unit test, we need to be able to shut down a peer completely
> to check the situation the test tries to reproduce. However, it seems that in
> some runs timing causes the other peers to believe it is still alive, and they
> end up electing it. This peer, however, eventually shuts down and leader
> election fails.
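
To make the first failure mode concrete, here is a minimal sketch of one way to keep a peer from opening two competing connections to the same remote server. It is not taken from the attached patch. The class `SingleConnectionGuard` and its methods are hypothetical names introduced only for illustration. The assumption is that every code path that wants to connect to a given server id first claims that id, and only the caller whose claim succeeds actually opens the socket.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/**
 * Hypothetical guard (not ZooKeeper code): allows at most one outgoing
 * leader-election connection per remote server id at a time.
 */
final class SingleConnectionGuard {

    // Remote server ids for which a connection is pending or established.
    private final ConcurrentMap<Long, Boolean> claimed = new ConcurrentHashMap<>();

    /**
     * Atomically claims the connection to remoteId.
     * Returns true only for the first caller; later callers get false
     * until release(remoteId) is called.
     */
    boolean tryClaim(long remoteId) {
        return claimed.putIfAbsent(remoteId, Boolean.TRUE) == null;
    }

    /** Releases the claim, e.g. after the connection has been closed. */
    void release(long remoteId) {
        claimed.remove(remoteId);
    }
}
```

In this sketch, a thread that reacts to an incoming connection and a thread that starts an election would both call `tryClaim(remoteId)` before connecting. Because `putIfAbsent` is atomic, only one of them proceeds, so no second connection exists to displace the first and drop the notification it carries.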