[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...

2017-12-13 Thread anmolnar
Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/432#discussion_r156793172
  
--- Diff: 
src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java ---
@@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws 
Exception {
 output[0], 2);
 }
 
+/**
+ * This test validates that if a quorum member determines that it is 
leader without the support of the rest of the
+ * quorum (the other members do not believe it to be the leader) it 
will stop attempting to lead and become a follower.
+ *
+ * @throws IOException
+ * @throws InterruptedException
+ */
+@Test
+public void testElectionFraud() throws IOException, 
InterruptedException {
+// capture QuorumPeer logging
+Layout layout = 
Logger.getRootLogger().getAppender("CONSOLE").getLayout();
+ByteArrayOutputStream os = new ByteArrayOutputStream();
+WriterAppender appender = new WriterAppender(layout, os);
+appender.setThreshold(Level.INFO);
+Logger qlogger = Logger.getLogger(QuorumPeer.class);
+qlogger.addAppender(appender);
+
+int numServers = 3;
+
+// used for assertions later
+boolean foundLeading = false;
+boolean foundLooking = false;
+boolean foundFollowing = false;
+
+try {
+  // spin up a quorum, we use a small ticktime to make the test 
run faster
+  Servers servers = LaunchServers(numServers, 500);
+
+  // find the leader
+  int trueLeader = -1;
+  for (int i = 0; i < numServers; i++) {
+if (servers.mt[i].main.quorumPeer.leader != null) {
+  trueLeader = i;
+}
+  }
+  Assert.assertTrue("There should be a leader", trueLeader >= 0);
+
+  // find a follower
+  int falseLeader = (trueLeader + 1) % numServers;
+  
Assert.assertTrue(servers.mt[falseLeader].main.quorumPeer.follower != null);
+
+  // to keep the quorum peer running and force it to go into the 
looking state, we kill leader election
+  // and close the connection to the leader
+  servers.mt[falseLeader].main.quorumPeer.electionAlg.shutdown();
+  
servers.mt[falseLeader].main.quorumPeer.follower.getSocket().close();
+
+  // wait for the falseLeader to disconnect
+  waitForOne(servers.zk[falseLeader], States.CONNECTING);
+
+  // convince falseLeader that it is the leader
+  
servers.mt[falseLeader].main.quorumPeer.setPeerState(QuorumPeer.ServerState.LEADING);
+
+  // provide time for the falseleader to realize no followers have 
connected
+  // (this is twice the timeout used in Leader#getEpochToPropose)
+  Thread.sleep(2 * 
servers.mt[falseLeader].main.quorumPeer.initLimit * 
servers.mt[falseLeader].main.quorumPeer.tickTime);
+
+  // Restart leader election
+  servers.mt[falseLeader].main.quorumPeer.startLeaderElection();
--- End diff --

I see. Hence the shutdown() call a few lines before.
Makes sense.


---


[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...

2017-12-13 Thread afine
Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/432#discussion_r156781040
  
--- Diff: 
src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java ---
@@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws 
Exception {
 output[0], 2);
 }
 
+/**
+ * This test validates that if a quorum member determines that it is 
leader without the support of the rest of the
+ * quorum (the other members do not believe it to be the leader) it 
will stop attempting to lead and become a follower.
+ *
+ * @throws IOException
+ * @throws InterruptedException
+ */
+@Test
+public void testElectionFraud() throws IOException, 
InterruptedException {
+// capture QuorumPeer logging
+Layout layout = 
Logger.getRootLogger().getAppender("CONSOLE").getLayout();
+ByteArrayOutputStream os = new ByteArrayOutputStream();
+WriterAppender appender = new WriterAppender(layout, os);
+appender.setThreshold(Level.INFO);
+Logger qlogger = Logger.getLogger(QuorumPeer.class);
+qlogger.addAppender(appender);
+
+int numServers = 3;
+
+// used for assertions later
+boolean foundLeading = false;
+boolean foundLooking = false;
+boolean foundFollowing = false;
+
+try {
+  // spin up a quorum, we use a small ticktime to make the test 
run faster
+  Servers servers = LaunchServers(numServers, 500);
+
+  // find the leader
+  int trueLeader = -1;
+  for (int i = 0; i < numServers; i++) {
+if (servers.mt[i].main.quorumPeer.leader != null) {
+  trueLeader = i;
+}
+  }
+  Assert.assertTrue("There should be a leader", trueLeader >= 0);
+
+  // find a follower
+  int falseLeader = (trueLeader + 1) % numServers;
+  
Assert.assertTrue(servers.mt[falseLeader].main.quorumPeer.follower != null);
+
+  // to keep the quorum peer running and force it to go into the 
looking state, we kill leader election
+  // and close the connection to the leader
+  servers.mt[falseLeader].main.quorumPeer.electionAlg.shutdown();
+  
servers.mt[falseLeader].main.quorumPeer.follower.getSocket().close();
+
+  // wait for the falseLeader to disconnect
+  waitForOne(servers.zk[falseLeader], States.CONNECTING);
+
+  // convince falseLeader that it is the leader
+  
servers.mt[falseLeader].main.quorumPeer.setPeerState(QuorumPeer.ServerState.LEADING);
+
+  // provide time for the falseleader to realize no followers have 
connected
+  // (this is twice the timeout used in Leader#getEpochToPropose)
+  Thread.sleep(2 * 
servers.mt[falseLeader].main.quorumPeer.initLimit * 
servers.mt[falseLeader].main.quorumPeer.tickTime);
+
+  // Restart leader election
+  servers.mt[falseLeader].main.quorumPeer.startLeaderElection();
--- End diff --

Stopping and starting leader election is necessary here to prevent a race 
condition. It is possible that after the server is disconnected from the leader 
it becomes a follower before the test hits 
`servers.mt[falseLeader].main.quorumPeer.setPeerState(QuorumPeer.ServerState.LEADING);`
 and falseLeader will never try to `lead`, defeating the purpose of the test.


---


[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...

2017-12-13 Thread afine
Github user afine commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/432#discussion_r156780239
  
--- Diff: 
src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java ---
@@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws 
Exception {
 output[0], 2);
 }
 
+/**
+ * This test validates that if a quorum member determines that it is 
leader without the support of the rest of the
+ * quorum (the other members do not believe it to be the leader) it 
will stop attempting to lead and become a follower.
+ *
+ * @throws IOException
+ * @throws InterruptedException
+ */
+@Test
+public void testElectionFraud() throws IOException, 
InterruptedException {
+// capture QuorumPeer logging
+Layout layout = 
Logger.getRootLogger().getAppender("CONSOLE").getLayout();
+ByteArrayOutputStream os = new ByteArrayOutputStream();
+WriterAppender appender = new WriterAppender(layout, os);
+appender.setThreshold(Level.INFO);
+Logger qlogger = Logger.getLogger(QuorumPeer.class);
+qlogger.addAppender(appender);
+
+int numServers = 3;
+
+// used for assertions later
+boolean foundLeading = false;
+boolean foundLooking = false;
+boolean foundFollowing = false;
+
+try {
+  // spin up a quorum, we use a small ticktime to make the test 
run faster
+  Servers servers = LaunchServers(numServers, 500);
--- End diff --

Fixed, forgot to put that in this branch.


---


[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...

2017-12-13 Thread anmolnar
Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/432#discussion_r156658012
  
--- Diff: 
src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java ---
@@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws 
Exception {
 output[0], 2);
 }
 
+/**
+ * This test validates that if a quorum member determines that it is 
leader without the support of the rest of the
+ * quorum (the other members do not believe it to be the leader) it 
will stop attempting to lead and become a follower.
+ *
+ * @throws IOException
+ * @throws InterruptedException
+ */
+@Test
+public void testElectionFraud() throws IOException, 
InterruptedException {
+// capture QuorumPeer logging
+Layout layout = 
Logger.getRootLogger().getAppender("CONSOLE").getLayout();
+ByteArrayOutputStream os = new ByteArrayOutputStream();
+WriterAppender appender = new WriterAppender(layout, os);
+appender.setThreshold(Level.INFO);
+Logger qlogger = Logger.getLogger(QuorumPeer.class);
+qlogger.addAppender(appender);
+
+int numServers = 3;
+
+// used for assertions later
+boolean foundLeading = false;
+boolean foundLooking = false;
+boolean foundFollowing = false;
+
+try {
+  // spin up a quorum, we use a small ticktime to make the test 
run faster
+  Servers servers = LaunchServers(numServers, 500);
+
+  // find the leader
+  int trueLeader = -1;
+  for (int i = 0; i < numServers; i++) {
+if (servers.mt[i].main.quorumPeer.leader != null) {
+  trueLeader = i;
+}
+  }
+  Assert.assertTrue("There should be a leader", trueLeader >= 0);
+
+  // find a follower
+  int falseLeader = (trueLeader + 1) % numServers;
+  
Assert.assertTrue(servers.mt[falseLeader].main.quorumPeer.follower != null);
--- End diff --

Minor: I believe that it's generally a good idea to add a short message to 
asserts to describe what could be the problem when assertion fails.


---


[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...

2017-12-13 Thread anmolnar
Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/432#discussion_r156660196
  
--- Diff: 
src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java ---
@@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws 
Exception {
 output[0], 2);
 }
 
+/**
+ * This test validates that if a quorum member determines that it is 
leader without the support of the rest of the
+ * quorum (the other members do not believe it to be the leader) it 
will stop attempting to lead and become a follower.
+ *
+ * @throws IOException
+ * @throws InterruptedException
+ */
+@Test
+public void testElectionFraud() throws IOException, 
InterruptedException {
+// capture QuorumPeer logging
+Layout layout = 
Logger.getRootLogger().getAppender("CONSOLE").getLayout();
+ByteArrayOutputStream os = new ByteArrayOutputStream();
+WriterAppender appender = new WriterAppender(layout, os);
+appender.setThreshold(Level.INFO);
+Logger qlogger = Logger.getLogger(QuorumPeer.class);
+qlogger.addAppender(appender);
+
+int numServers = 3;
+
+// used for assertions later
+boolean foundLeading = false;
+boolean foundLooking = false;
+boolean foundFollowing = false;
+
+try {
+  // spin up a quorum, we use a small ticktime to make the test 
run faster
+  Servers servers = LaunchServers(numServers, 500);
+
+  // find the leader
+  int trueLeader = -1;
+  for (int i = 0; i < numServers; i++) {
+if (servers.mt[i].main.quorumPeer.leader != null) {
+  trueLeader = i;
+}
+  }
+  Assert.assertTrue("There should be a leader", trueLeader >= 0);
+
+  // find a follower
+  int falseLeader = (trueLeader + 1) % numServers;
+  
Assert.assertTrue(servers.mt[falseLeader].main.quorumPeer.follower != null);
+
+  // to keep the quorum peer running and force it to go into the 
looking state, we kill leader election
+  // and close the connection to the leader
+  servers.mt[falseLeader].main.quorumPeer.electionAlg.shutdown();
+  
servers.mt[falseLeader].main.quorumPeer.follower.getSocket().close();
+
+  // wait for the falseLeader to disconnect
+  waitForOne(servers.zk[falseLeader], States.CONNECTING);
+
+  // convince falseLeader that it is the leader
+  
servers.mt[falseLeader].main.quorumPeer.setPeerState(QuorumPeer.ServerState.LEADING);
+
+  // provide time for the falseleader to realize no followers have 
connected
+  // (this is twice the timeout used in Leader#getEpochToPropose)
+  Thread.sleep(2 * 
servers.mt[falseLeader].main.quorumPeer.initLimit * 
servers.mt[falseLeader].main.quorumPeer.tickTime);
+
+  // Restart leader election
+  servers.mt[falseLeader].main.quorumPeer.startLeaderElection();
--- End diff --

I wonder if this one has to be called explicitly. 

The peer should automatically realise that it's no longer the leader, stop 
leading and start leader election (which is basically looking). That's what 
this test is intended to validate and shouldn't be started explicitly.

Correct me if I'm wrong.


---


[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...

2017-12-13 Thread anmolnar
Github user anmolnar commented on a diff in the pull request:

https://github.com/apache/zookeeper/pull/432#discussion_r156657072
  
--- Diff: 
src/java/test/org/apache/zookeeper/server/quorum/QuorumPeerMainTest.java ---
@@ -335,6 +336,100 @@ public void testHighestZxidJoinLate() throws 
Exception {
 output[0], 2);
 }
 
+/**
+ * This test validates that if a quorum member determines that it is 
leader without the support of the rest of the
+ * quorum (the other members do not believe it to be the leader) it 
will stop attempting to lead and become a follower.
+ *
+ * @throws IOException
+ * @throws InterruptedException
+ */
+@Test
+public void testElectionFraud() throws IOException, 
InterruptedException {
+// capture QuorumPeer logging
+Layout layout = 
Logger.getRootLogger().getAppender("CONSOLE").getLayout();
+ByteArrayOutputStream os = new ByteArrayOutputStream();
+WriterAppender appender = new WriterAppender(layout, os);
+appender.setThreshold(Level.INFO);
+Logger qlogger = Logger.getLogger(QuorumPeer.class);
+qlogger.addAppender(appender);
+
+int numServers = 3;
+
+// used for assertions later
+boolean foundLeading = false;
+boolean foundLooking = false;
+boolean foundFollowing = false;
+
+try {
+  // spin up a quorum, we use a small ticktime to make the test 
run faster
+  Servers servers = LaunchServers(numServers, 500);
--- End diff --

It'd be useful to utilize the class-wide ```servers``` field which is used 
to shutdown test servers in the ```tearDown()``` method.

For example ```testHighestZxidJoinLate()``` test does it that way.


---


[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...

2017-12-12 Thread afine
Github user afine closed the pull request at:

https://github.com/apache/zookeeper/pull/432


---


[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...

2017-12-12 Thread afine
GitHub user afine reopened a pull request:

https://github.com/apache/zookeeper/pull/432

[WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/afine/zookeeper ZOOKEEPER-2953

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/432.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #432


commit 68a0382093da1d64583211746ac672ed12f5da5c
Author: Abraham Fine 
Date:   2017-12-13T00:35:50Z

ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment




---


[GitHub] zookeeper pull request #432: [WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBefo...

2017-12-12 Thread afine
GitHub user afine opened a pull request:

https://github.com/apache/zookeeper/pull/432

[WIP] ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/afine/zookeeper ZOOKEEPER-2953

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/zookeeper/pull/432.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #432


commit 68a0382093da1d64583211746ac672ed12f5da5c
Author: Abraham Fine 
Date:   2017-12-13T00:35:50Z

ZOOKEEPER-2953: Flaky Test: testNoLogBeforeLeaderEstablishment




---