Anyone know why Patrick's log file might be showing a lot of this before the error?
2011-11-06 01:02:39,905 [myid:2] - INFO [Thread-76:NIOServerCnxn$StatCommand@655] - Stat command output

This test never does a stat call; it uses a ZK client to connect in.
This seems strange; perhaps the issue is a test setup one?

C

On Mon, Nov 7, 2011 at 6:23 PM, Patrick Hunt <ph...@apache.org> wrote:
> That's fine (direction re 1-4). However my CI branch 3.4 build failed
> over the w/e (once out of four runs). This is AFTER "Preparing for
> release 3.4.0 - take 2" was applied (so testing includes 1270, 1264,
> etc...)
>
> Notice testEarlyLeaderAbandonment is failing. I have attached the log
> file to ZOOKEEPER-1270 JIRA:
> https://issues.apache.org/jira/secure/attachment/12502838/testEarlyLeaderAbandonment5.txt.gz
>
> java.lang.RuntimeException: Waiting too long
>     at org.apache.zookeeper.server.quorum.QuorumPeerMainTest.waitForAll(QuorumPeerMainTest.java:324)
>     at org.apache.zookeeper.server.quorum.QuorumPeerMainTest.testEarlyLeaderAbandonment(QuorumPeerMainTest.java:195)
>     at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
>
> Should I reopen 1270, or a new jira, or... ? LMK.
>
> Note - I'm feeling quite ill so I have limited time to provide f/b &
> test for the next day or so.
>
> Patrick
>
> On Sat, Nov 5, 2011 at 12:22 PM, Flavio Junqueira <f...@yahoo-inc.com> wrote:
>> I'm fine with your proposal. -Flavio
>>
>> On Nov 5, 2011, at 8:15 PM, Camille Fournier wrote:
>>
>>> 2 has been flaky for so long, not sure whether it's worth being a blocker.
>>> The AsyncHammerTests never pass for me locally. Not sure if it's a
>>> problem or not... I am tempted to go with Mahadev on this and get this
>>> 3.4 release out the door. I would be happy to help manage a 3.4.1
>>> release soon thereafter if we find serious issues.
>>>
>>> C
>>>
>>> On Sat, Nov 5, 2011 at 3:01 PM, Flavio Junqueira <f...@yahoo-inc.com>
>>> wrote:
>>>>
>>>> If 2) is flakey, we need to fix it, no?
>>>>
>>>> -Flavio
>>>>
>>>> On Nov 5, 2011, at 6:14 PM, Patrick Hunt wrote:
>>>>
>>>>> I ran the 1270-1194 patch continually overnight (trunk) in my ci env,
>>>>> after ~25 test runs I saw 4 failures:
>>>>>
>>>>> 1) #402 - QuorumTest.testFollowersStartAfterLeader
>>>>> 2) #407 - org.apache.zookeeper.test.FLETest.testLE
>>>>> 3) #410 - org.apache.zookeeper.test.AsyncHammerTest.testHammer
>>>>> 4) #415 - org.apache.zookeeper.test.AsyncHammerTest.testHammer
>>>>>
>>>>> 1) client could not connect to reestablished quorum: giving up after
>>>>> 30+ seconds.
>>>>> 2) known flakey test
>>>>> 3) QP failed to shutdown in 30 seconds:
>>>>> QuorumPeer[myid=3]0.0.0.0/0.0.0.0:11224
>>>>> 4) QP failed to shutdown in 30 seconds:
>>>>> QuorumPeer[myid=1]0.0.0.0/0.0.0.0:11222
>>>>>
>>>>> On the plus side no "testearlyleaderabandon" failures.
>>>>>
>>>>> On the minus side 3/4 are a bit worrisome. Searching back through all
>>>>> my previous failures I don't see this happening. Perhaps these changes
>>>>> have shifted some timing? My main concern is that this might be caused
>>>>> directly by the patch itself....
>>>>>
>>>>> Patrick
>>>>
>>>> flavio
>>>> junqueira
>>>>
>>>> research scientist
>>>>
>>>> f...@yahoo-inc.com
>>>> direct +34 93-183-8828
>>>>
>>>> avinguda diagonal 177, 8th floor, barcelona, 08018, es
>>>> phone (408) 349 3300 fax (408) 349 3301
>>>>
>>>>
>>
>> flavio
>> junqueira
>>
>> research scientist
>>
>> f...@yahoo-inc.com
>> direct +34 93-183-8828
>>
>> avinguda diagonal 177, 8th floor, barcelona, 08018, es
>> phone (408) 349 3300 fax (408) 349 3301
>>
>>
>