[jira] [Updated] (ZOOKEEPER-2953) Flaky Test: testNoLogBeforeLeaderEstablishment

Abraham Fine (JIRA) Thu, 14 Dec 2017 08:45:15 -0800

     [ 
https://issues.apache.org/jira/browse/ZOOKEEPER-2953?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Abraham Fine updated ZOOKEEPER-2953:
------------------------------------
    Description: 
testNoLogBeforeLeaderEstablishment has been flaky on 3.4, 3.5, and master for 
quite a while. My understanding is that the purpose of the test is to make sure 
that a server receives support from the quorum before changing the epoch and 
acting as leader. 

There are a couple of issues with the test in its current state. First, the 
assertions the test makes are not always true. It is possible, if the ZooKeeper 
database is not cleared, for a follower to be ahead of a leader when the quorum 
is shut down. That follower will then likely become leader when the quorum is 
restarted. This is the cause of the flaky behavior. Second, the test does not 
appear to create the conditions it wants to test for. Since ZOOKEEPER-335 
(specifically the ZOOKEEPER-1081 subtask), we take the epoch into consideration 
in {{FastLeaderElection}}, so the test no longer "believes it is the leader once 
it recovers".
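
To illustrate why a follower that is ahead tends to win the next election, 
here is a minimal sketch of a vote ordering that compares epoch first, then 
zxid, then server id. It is an assumption-laden illustration, not the 
{{FastLeaderElection}} source: the class {{VoteOrderSketch}}, the {{Vote}} type, 
and the {{supersedes}} method are hypothetical names, and quorum weights are 
ignored.

```java
// Illustrative sketch only; hypothetical names, not ZooKeeper's actual code.
class VoteOrderSketch {

    /** Hypothetical value type: a vote proposing one server as leader. */
    static final class Vote {
        final long serverId;
        final long zxid;
        final long epoch;

        Vote(long serverId, long zxid, long epoch) {
            this.serverId = serverId;
            this.zxid = zxid;
            this.epoch = epoch;
        }
    }

    /**
     * Returns true if the candidate vote should replace the current one.
     * Assumed ordering: higher epoch wins; on a tie, higher zxid wins;
     * on a further tie, higher server id wins.
     */
    static boolean supersedes(Vote candidate, Vote current) {
        if (candidate.epoch != current.epoch) {
            return candidate.epoch > current.epoch;
        }
        if (candidate.zxid != current.zxid) {
            return candidate.zxid > current.zxid;
        }
        return candidate.serverId > current.serverId;
    }

    public static void main(String[] args) {
        // Server 3 was leader before shutdown; server 1 was a follower
        // that had logged two more transactions in the same epoch.
        Vote oldLeader = new Vote(3, 5, 1);
        Vote aheadFollower = new Vote(1, 7, 1);
        System.out.println(supersedes(aheadFollower, oldLeader)); // true

        // A server with an older epoch loses even with a larger zxid.
        Vote staleEpoch = new Vote(2, 9, 0);
        System.out.println(supersedes(staleEpoch, oldLeader)); // false
    }
}
```

Under this ordering, a test that assumes the previous leader is always 
re-elected fails whenever a follower restarts with a higher zxid.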

After discussing the issue offline with [~phunt], we decided it would still be 
valuable to test the situation where a server is elected leader without the 
support of the quorum. So I removed {{testNoLogBeforeLeaderEstablishment}} and 
created a new test called {{testElectionFraud}}.

  was:
testNoLogBeforeLeaderEstablishment has been flaky on 3.4, 3.5, and master for 
quite a while. My understanding is that the purpose of the test is to make sure 
that a server receives support from the quorum before changing the epoch and 
acting as leader. 

There are a couple of issues with the test in its current state. First, the 
assertions the test makes are not always true. It is possible, if the ZooKeeper 
database is not cleared, for a follower to be ahead of a leader when the quorum 
is shut down. That follower will then likely become leader when the quorum is 
restarted. This is the cause of the flaky behavior. Second, the test does not 
appear to create the conditions it wants to test for. Since ZOOKEEPER-335 
(specifically the ZOOKEEPER-1081 subtask), we take the epoch into consideration 
in {{FastLeaderElection}}, so the test no longer "believes it is the leader once 
it recovers".

After discussing the issue offline with [~phunt], we decided it would still be 
valuable to test the situation where a server is elected leader without the 
support of the quorum. So I removed 


> Flaky Test: testNoLogBeforeLeaderEstablishment
> ----------------------------------------------
>
>                 Key: ZOOKEEPER-2953
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-2953
>             Project: ZooKeeper
>          Issue Type: Bug
>    Affects Versions: 3.5.3, 3.4.11, 3.6.0
>            Reporter: Abraham Fine
>            Assignee: Abraham Fine
>
> testNoLogBeforeLeaderEstablishment has been flaky on 3.4, 3.5, and master for 
> quite a while. My understanding is that the purpose of the test is to make 
> sure that a server receives support from the quorum before changing the epoch 
> and acting as leader. 
> There are a couple of issues with the test in its current state. First, the 
> assertions the test makes are not always true. It is possible, if the 
> ZooKeeper database is not cleared, for a follower to be ahead of a leader 
> when the quorum is shut down. That follower will then likely become leader 
> when the quorum is restarted. This is the cause of the flaky behavior. 
> Second, the test does not appear to create the conditions it wants to test 
> for. Since ZOOKEEPER-335 (specifically the ZOOKEEPER-1081 subtask), we take 
> the epoch into consideration in {{FastLeaderElection}}, so the test no longer 
> "believes it is the leader once it recovers".
> After discussing the issue offline with [~phunt], we decided it would still be 
> valuable to test the situation where a server is elected leader without the 
> support of the quorum. So I removed {{testNoLogBeforeLeaderEstablishment}} 
> and created a new test called {{testElectionFraud}}.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
