Also, if you want to submit a patch that provides more insight (logs) for that operation/test, lmk and I'll be happy to review/commit it. That should help with debugging this issue and with debugging in the field.
Thanks!

Patrick

On Tue, Jul 22, 2014 at 12:17 PM, Patrick Hunt <ph...@apache.org> wrote:
> Here's the logs (attached) for the test that failed. Nothing stuck out
> at me - anything ring a bell?
>
> Patrick
>
> On Tue, Jul 22, 2014 at 12:10 PM, Alexander Shraer <shra...@gmail.com> wrote:
>> Unfortunately doesn't look like we have enough logging going on there.
>> For example would be nice to know what's the committed config and last
>> seen config of the leader when it comes up (leader.lead()). and what
>> configuration is sent in the NEWLEADER message sent out in LeaderHandler:
>>
>>     QuorumPacket newLeaderQP = new QuorumPacket(Leader.NEWLEADER,
>>             newLeaderZxid,
>>             leader.self.getLastSeenQuorumVerifier().toString().getBytes(),
>>             null);
>>
>> I didn't know about the option to have a separate administrative interface,
>> and just followed the flow of other commands... I agree that it would be
>> cleaner.
>>
>> On Tue, Jul 22, 2014 at 11:36 AM, Patrick Hunt <ph...@apache.org> wrote:
>>
>>> On Tue, Jul 22, 2014 at 11:29 AM, Alexander Shraer <shra...@gmail.com>
>>> wrote:
>>> > Hmm. It doesn't really make sense to me - the reconfig should be
>>> > completed before the servers come up and process new ops. We submitted
>>> > the reconfig to server 1, it timed out on new quorum, but when 1
>>> > becomes leader again after 2 restarts 1 should complete the reconfig.
>>> > is 1 becoming leader after 2 restarts ?
>>> >
>>>
>>> What should I look for in the logs? Any specific log messages that
>>> would help debug?
>>>
>>> > About admin controls - reconfig/getConfig are open to everyone, unless
>>> > you set permissions on the configuration znode being written during
>>> > reconfig.
>>> >
>>> >     nodeRecord = getRecordForPath(ZooDefs.CONFIG_NODE);
>>> >
>>> >     checkACL(zks, nodeRecord.acl, ZooDefs.Perms.WRITE,
>>> >             request.authInfo);
>>> >
>>>
>>> So I can turn off all access then? (read and write). Should we ship
>>> that as the default?
>>> We should add that to the docs.
>>>
>>> In the past we've always tried to hide this type of information from
>>> clients (e.g. we don't expose the zk server address to the client for
>>> a session). This seems like a very big departure. Why didn't we move
>>> it to a separate, administrative, interface?
>>>
>>> Patrick
>>>
>>> >
>>> > On Tue, Jul 22, 2014 at 11:16 AM, Patrick Hunt <phu...@gmail.com> wrote:
>>> >
>>> >> Looks like 3 hasn't been removed (unfortunately the assertion doesn't
>>> >> include any msg detail, but that's the way it looks to me like the
>>> >> test is setup):
>>> >>
>>> >>     if (leavingServers != null) {
>>> >>         for (String leaving : leavingServers)
>>> >>             Assert.assertFalse(configStr.contains("server.".concat(leaving)));
>>> >>     }
>>> >>
>>> >> which is called from:
>>> >>
>>> >>     qu.restart(2);
>>> >>     // Now that 2 is back up, they'll complete the reconfig removing 3 and
>>> >>     // can process other ops.
>>> >>     testServerHasConfig(zkArr[1], null, leavingServers);
>>> >>
>>> >> It seems like the problem is that testServerHasConfig is not waiting
>>> >> for the configuration to be updated? In this case 2 was just restarted
>>> >> and 3 hasn't had a chance to be removed? (on a slower machine say,
>>> >> which might be why you aren't seeing the issue? hence the flakeyness)
>>> >>
>>> >> Patrick
>>> >>
>>> >> On Tue, Jul 22, 2014 at 10:57 AM, Alexander Shraer <shra...@gmail.com>
>>> >> wrote:
>>> >> > Hi Patrick, I'm not sure why you're seeing this - it consistently
>>> >> > passes on my machine. In case you'd like to take a look, the test has
>>> >> > tons of comments explaining the scenario. Let me know how I can help.
>>> >> >
>>> >> >
>>> >> > On Tue, Jul 22, 2014 at 9:53 AM, Patrick Hunt <ph...@apache.org> wrote:
>>> >> >
>>> >> >> Hi Alex, I've also seen the test "testLeaderTimesoutOnNewQuorum" fail
>>> >> >> multiple times (not every time, but ~50%, so flakey) in the last few
>>> >> >> days.
It's failing both on jdk6 and jdk7. (this is my personal >>> >> >> jenkins, I haven't see any other failures than this during the past >>> >> >> few days). >>> >> >> >>> >> >> junit.framework.AssertionFailedError >>> >> >> at >>> >> >> >>> >> >>> org.apache.zookeeper.test.ReconfigTest.testServerHasConfig(ReconfigTest.java:127) >>> >> >> at >>> >> >> >>> >> >>> org.apache.zookeeper.test.ReconfigTest.testLeaderTimesoutOnNewQuorum(ReconfigTest.java:450) >>> >> >> at >>> >> >> >>> >> >>> org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52) >>> >> >> >>> >> >> Patrick >>> >> >> >>> >> >> On Tue, Jul 22, 2014 at 8:37 AM, Alexander Shraer <shra...@gmail.com >>> > >>> >> >> wrote: >>> >> >> > Hi Rakesh, >>> >> >> > >>> >> >> > Thanks for looking at this. In general even if we find the bug >>> since >>> >> we >>> >> >> > should test it before committing a fix, it seems better to remove >>> the >>> >> >> test >>> >> >> > for now and debug this on a build machine. I'm trying to get >>> access to >>> >> >> it. >>> >> >> > >>> >> >> > Looking at this log: >>> >> >> > >>> >> >> >>> >> >>> https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper-trunk/2380/testReport/org.apache.zookeeper.server.quorum/ReconfigRecoveryTest/testCurrentObserverIsParticipantInNewConfig/ >>> >> >> > >>> >> >> > Something weird is going on. Sever 3 hasn't started yet, but >>> version >>> >> >> 200000000 >>> >> >> > is already being sent around as committed! >>> >> >> > >>> >> >> > 2014-07-21 10:44:50,901 [myid:2] - INFO >>> >> >> > >>> >> [WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@293 >>> ] >>> >> >> > - 2 Received version: 200000000 my version: 0 >>> >> >> > >>> >> >> > >>> >> >> > and also in leader election messages. >>> >> >> > >>> >> >> > Also weird is that the version of 2 is 0 as if it is a joiner, >>> >> whereas we >>> >> >> > explicitly started it with 100000000. 
>>> >> >> > Then it makes sense that the new config can't be committed since >>> its >>> >> >> > version is not high enough... >>> >> >> > >>> >> >> > I wonder if its possible that not all servers from the previous >>> test >>> >> are >>> >> >> > dead and they are interfering... >>> >> >> > >>> >> >> > >>> >> >> > On Tue, Jul 22, 2014 at 3:53 AM, Rakesh R <rake...@huawei.com> >>> wrote: >>> >> >> > >>> >> >> >> Hi Alex, >>> >> >> >> >>> >> >> >> Yeah it is consistently passing in my machine also. >>> >> >> >> >>> >> >> >> >>> >> >> >> I have quickly gone through the >>> >> >> >> testCurrentObserverIsParticipantInNewConfig failure logs in >>> >> >> >> PreCommit-ZOOKEEPER-Build. It looks like 200000000 (n.config >>> version) >>> >> >> has >>> >> >> >> not taken and still leader election is seeing 100000000 (n.config >>> >> >> version). >>> >> >> >> Unfortunately I didn't find the reason for not considering the >>> >> updated >>> >> >> >> config version. >>> >> >> >> >>> >> >> >> >>> >> >> >> Reference: >>> >> >> >> >>> >> >> >>> >> >>> https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2213/testReport/junit/org.apache.zookeeper.server.quorum/ReconfigRecoveryTest/testCurrentObserverIsParticipantInNewConfig >>> >> >> >> >>> >> >> >> 2014-07-22 06:38:00,330 [myid:1] - INFO >>> >> >> >> [QuorumPeer[myid=1]/127.0.0.1:11298:FastLeaderElection@922] - >>> >> >> >> Notification time out: 51200 >>> >> >> >> 2014-07-22 06:38:00,330 [myid:1] - INFO >>> >> >> >> [WorkerReceiver[myid=1]:FastLeaderElection@682] - Notification: >>> 2 >>> >> >> >> (message format version), 1 (n.leader), 0x100000005 (n.zxid), 0x1 >>> >> >> >> (n.round), LOOKING (n.state), 1 (n.sid), 0x1 (n.peerEPoch), >>> LOOKING >>> >> (my >>> >> >> >> state)100000000 (n.config version) >>> >> >> >> 2014-07-22 06:38:00,331 [myid:2] - INFO >>> >> >> >> [WorkerReceiver[myid=2]:FastLeaderElection@682] - Notification: >>> 2 >>> >> >> >> (message format version), 1 (n.leader), 0x100000005 (n.zxid), 0x1 >>> >> >> >> 
(n.round), LOOKING (n.state), 2 (n.sid), 0x1 (n.peerEPoch), >>> LOOKING >>> >> (my >>> >> >> >> state)100000000 (n.config version) >>> >> >> >> 2014-07-22 06:38:00,330 [myid:2] - INFO >>> >> >> >> [QuorumPeer[myid=2]/127.0.0.1:11301:FastLeaderElection@922] - >>> >> >> >> Notification time out: 51200 >>> >> >> >> 2014-07-22 06:38:00,331 [myid:0] - INFO >>> >> >> >> [WorkerReceiver[myid=0]:FastLeaderElection@682] - Notification: >>> 2 >>> >> >> >> (message format version), 1 (n.leader), 0x100000005 (n.zxid), 0x1 >>> >> >> >> (n.round), LOOKING (n.state), 1 (n.sid), 0x1 (n.peerEPoch), >>> LOOKING >>> >> (my >>> >> >> >> state)100000000 (n.config version) >>> >> >> >> 2014-07-22 06:38:00,331 [myid:2] - INFO >>> >> >> >> [WorkerReceiver[myid=2]:FastLeaderElection@682] - Notification: >>> 2 >>> >> >> >> (message format version), 1 (n.leader), 0x100000005 (n.zxid), 0x1 >>> >> >> >> (n.round), LOOKING (n.state), 1 (n.sid), 0x1 (n.peerEPoch), >>> LOOKING >>> >> (my >>> >> >> >> state)100000000 (n.config version) >>> >> >> >> >>> >> >> >> >>> >> >> >> 2014-07-22 06:38:00,332 [myid:0] - INFO >>> >> >> >> [WorkerReceiver[myid=0]:FastLeaderElection@682] - Notification: >>> 2 >>> >> >> >> (message format version), 1 (n.leader), 0x100000005 (n.zxid), 0x1 >>> >> >> >> (n.round), LOOKING (n.state), 2 (n.sid), 0x1 (n.peerEPoch), >>> LOOKING >>> >> (my >>> >> >> >> state)100000000 (n.config version) >>> >> >> >> 2014-07-22 06:38:00,332 [myid:1] - INFO >>> >> >> >> [WorkerReceiver[myid=1]:FastLeaderElection@682] - Notification: >>> 2 >>> >> >> >> (message format version), 1 (n.leader), 0x100000005 (n.zxid), 0x1 >>> >> >> >> (n.round), LOOKING (n.state), 2 (n.sid), 0x1 (n.peerEPoch), >>> LOOKING >>> >> (my >>> >> >> >> state)100000000 (n.config version) >>> >> >> >> >>> >> >> >> >>> >> >> >> -Rakesh >>> >> >> >> >>> >> >> >> -----Original Message----- >>> >> >> >> From: Alexander Shraer [mailto:shra...@gmail.com] >>> >> >> >> Sent: 22 July 2014 11:57 >>> >> >> >> To: 
dev@zookeeper.apache.org >>> >> >> >> Subject: Re: ZooKeeper 3.5.0-alpha planning >>> >> >> >> >>> >> >> >> I tried to look into it, but the test consistently passes locally >>> on >>> >> two >>> >> >> >> machines. >>> >> >> >> I don't currently have access to the build machine, but I can try >>> to >>> >> ask >>> >> >> >> for access. >>> >> >> >> Unless anyone has a better suggestion, we could remove the failing >>> >> test >>> >> >> in >>> >> >> >> the meanwhile and open a JIRA to add it back... >>> >> >> >> >>> >> >> >> >>> >> >> >> On Mon, Jul 21, 2014 at 10:09 PM, Patrick Hunt <ph...@apache.org> >>> >> >> wrote: >>> >> >> >> >>> >> >> >> > I'm seeing alot of test failures in >>> >> >> >> > testCurrentObserverIsParticipantInNewConfig could someone take a >>> >> look? >>> >> >> >> > Seems related to ZOOKEEPER-1807 recent commit. >>> >> >> >> > >>> >> >> >> > >>> >> >> >> > >>> >> >> >>> https://issues.apache.org/jira/browse/ZOOKEEPER-1807?focusedCommentId= >>> >> >> >> > >>> >> 14069024&page=com.atlassian.jira.plugin.system.issuetabpanels:comment- >>> >> >> >> > tabpanel#comment-14069024 >>> >> >> >> > >>> >> >> >> > Patrick >>> >> >> >> > >>> >> >> >> > On Mon, Jul 21, 2014 at 11:12 AM, Rakesh Radhakrishnan >>> >> >> >> > <rakeshr.apa...@gmail.com> wrote: >>> >> >> >> > > lgtm +1 >>> >> >> >> > > >>> >> >> >> > > >>> >> >> >> > > On Mon, Jul 21, 2014 at 11:37 PM, FPJ >>> >> >> >> > > <fpjunque...@yahoo.com.invalid> >>> >> >> >> > wrote: >>> >> >> >> > > >>> >> >> >> > >> +1 for having an RC this week. Since this is an alpha >>> release, I >>> >> >> >> > >> +think >>> >> >> >> > 72 >>> >> >> >> > >> biz hours is enough for the vote. 
>>> >> >> >> > >> >>> >> >> >> > >> -Flavio >>> >> >> >> > >> >>> >> >> >> > >> > -----Original Message----- >>> >> >> >> > >> > From: Patrick Hunt [mailto:ph...@apache.org] >>> >> >> >> > >> > Sent: 21 July 2014 18:55 >>> >> >> >> > >> > To: DevZooKeeper >>> >> >> >> > >> > Subject: Re: ZooKeeper 3.5.0-alpha planning >>> >> >> >> > >> > >>> >> >> >> > >> > I fixed a number of issues. I also started a few threads >>> with >>> >> >> >> > >> > builds@ >>> >> >> >> > >> > - the ulimit issue is still outstanding. Hongchao and I >>> worked >>> >> >> >> > through a >>> >> >> >> > >> > number of findbugs issues, it's not closed yet but it's >>> pretty >>> >> >> >> close. >>> >> >> >> > >> > >>> >> >> >> > >> > I don't see why we can't create an RC and start voting this >>> >> week >>> >> >> >> > though. >>> >> >> >> > >> > Anyone disagree? >>> >> >> >> > >> > >>> >> >> >> > >> > How long should we let the vote run, the std 72 biz hours >>> or >>> >> >> >> > >> > should we >>> >> >> >> > >> plan >>> >> >> >> > >> > for more to allow folks more time to test? >>> >> >> >> > >> > >>> >> >> >> > >> > Patrick >>> >> >> >> > >> > >>> >> >> >> > >> > On Mon, Jul 21, 2014 at 10:29 AM, Raúl Gutiérrez Segalés >>> >> >> >> > >> > <r...@itevenworks.net> wrote: >>> >> >> >> > >> > > On 18 July 2014 10:32, Patrick Hunt <ph...@apache.org> >>> >> wrote: >>> >> >> >> > >> > > >>> >> >> >> > >> > >> You may notice some back/forth on Apache Jenkins ZK >>> jobs - >>> >> I'm >>> >> >> >> > trying >>> >> >> >> > >> > >> to fix some of the jobs that were broken during the >>> recent >>> >> >> >> > >> > >> host upgrade. >>> >> >> >> > >> > >> >>> >> >> >> > >> > > >>> >> >> >> > >> > > How are things looking? Is it likely that we can have a >>> >> 3.5.0 >>> >> >> >> > >> > > alpha release week or are we still blocked on Jenkins? 
>>> >> >> >> > >> > > >>> >> >> >> > >> > > >>> >> >> >> > >> > > -rgs >>> >> >> >> > >> > > >>> >> >> >> > >> > > >>> >> >> >> > >> > > >>> >> >> >> > >> > > >>> >> >> >> > >> > > >>> >> >> >> > >> > > >>> >> >> >> > >> > >> Patrick >>> >> >> >> > >> > >> >>> >> >> >> > >> > >> On Thu, Jul 17, 2014 at 1:47 PM, Michi Mutsuzaki >>> >> >> >> > >> > >> <mi...@cs.stanford.edu> >>> >> >> >> > >> > >> wrote: >>> >> >> >> > >> > >> > I'll check in ZOOKEEPER-1683. >>> >> >> >> > >> > >> > >>> >> >> >> > >> > >> > On Thu, Jul 17, 2014 at 11:20 AM, Alexander Shraer >>> >> >> >> > >> > >> > <shra...@gmail.com> >>> >> >> >> > >> > >> wrote: >>> >> >> >> > >> > >> >> can we also have ZOOKEEPER-1683 in ? Camille gave a >>> +1 >>> >> and >>> >> >> >> > >> > >> >> all >>> >> >> >> > >> > >> subsequent >>> >> >> >> > >> > >> >> changes were formatting as suggested by Rakesh. >>> >> >> >> > >> > >> >> >>> >> >> >> > >> > >> >> >>> >> >> >> > >> > >> >> On Thu, Jul 17, 2014 at 9:48 AM, Patrick Hunt >>> >> >> >> > >> > >> >> <ph...@apache.org >>> >> >> >> > > >>> >> >> >> > >> > wrote: >>> >> >> >> > >> > >> >> >>> >> >> >> > >> > >> >>> I'm concerned that the CI tests are all failing due >>> to, >>> >> >> >> > >> > >> >>> for >>> >> >> >> > e.g. >>> >> >> >> > >> > >> >>> findbugs issues. At the very least our build/test/ci >>> >> >> >> > >> > >> >>> should be pretty clean - some flakeys is ok (the >>> recent >>> >> >> >> > >> > >> >>> startServer fix >>> >> >> >> > and >>> >> >> >> > >> > >> >>> some other flakeys that have been addressed go a >>> long >>> >> way >>> >> >> >> > >> > >> >>> on >>> >> >> >> > that >>> >> >> >> > >> > >> >>> issue) but I think the findbugs problem should be >>> >> cleaned >>> >> >> >> > >> > >> >>> up before we cut a release. I started a separate >>> >> thread to >>> >> >> >> > >> > >> >>> discuss >>> >> >> >> > >> the >>> >> >> >> > >> > findbugs issue. >>> >> >> >> > >> > >> >>> >>> >> >> >> > >> > >> >>> Otw we seem to be in ok shape - 1863 is in. 
>>> >> >> >> > >> > >> >>> >>> >> >> >> > >> > >> >>> Anyone have a chance to give feedback to Raul on >>> 1919? >>> >> >> >> > >> > >> >>> >>> >> >> >> > >> > >> >>> Patrick >>> >> >> >> > >> > >> >>> >>> >> >> >> > >> > >> >>> On Tue, Jul 15, 2014 at 10:34 AM, Flavio Junqueira >>> >> >> >> > >> > >> >>> <fpjunque...@yahoo.com.invalid> wrote: >>> >> >> >> > >> > >> >>> > My take: >>> >> >> >> > >> > >> >>> > >>> >> >> >> > >> > >> >>> > - ZK-1863 is pending review. It is a blocker and >>> it >>> >> can >>> >> >> >> > >> > >> >>> > go >>> >> >> >> > in. >>> >> >> >> > >> > >> >>> > See >>> >> >> >> > >> > >> the >>> >> >> >> > >> > >> >>> jira for comments. >>> >> >> >> > >> > >> >>> > - We can try to have ZK-1807 in for the first >>> alpha. >>> >> >> >> > >> > >> >>> > - I'd rather not have the first alpha depending on >>> >> >> >> > >> > >> >>> > ZK-1919 >>> >> >> >> > and >>> >> >> >> > >> > >> ZK-1910, >>> >> >> >> > >> > >> >>> we can leave it for the second alpha. >>> >> >> >> > >> > >> >>> > >>> >> >> >> > >> > >> >>> > If you agree with this, then we should be able to >>> >> cut a >>> >> >> >> > >> > >> >>> > candidate by >>> >> >> >> > >> > >> the >>> >> >> >> > >> > >> >>> end of this week. 
>>> >> >> >> > >> > >> >>> > >>> >> >> >> > >> > >> >>> > -Flavio >>> >> >> >> > >> > >> >>> > >>> >> >> >> > >> > >> >>> > On 15 Jul 2014, at 17:26, Patrick Hunt >>> >> >> >> > >> > >> >>> > <ph...@apache.org> >>> >> >> >> > >> wrote: >>> >> >> >> > >> > >> >>> > >>> >> >> >> > >> > >> >>> >> Per my previous note you can now see the c client >>> >> test >>> >> >> >> > >> > >> >>> >> log output >>> >> >> >> > >> > >> here >>> >> >> >> > >> > >> >>> >> in the "build artifacts" section: >>> >> >> >> > >> > >> >>> >> >>> >> >> >> > >> > >> >>> >>> >> >> >> > >> > >> >>> >> >> https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeepe >>> >> >> >> > >> > >> r- >>> >> >> >> > >> > trunk >>> >> >> >> > >> > >> /2372/ >>> >> >> >> > >> > >> >>> >> >>> >> >> >> > >> > >> >>> >> Patrick >>> >> >> >> > >> > >> >>> >> >>> >> >> >> > >> > >> >>> >> On Mon, Jul 14, 2014 at 7:36 PM, Patrick Hunt >>> >> >> >> > >> > >> >>> >> <ph...@apache.org> >>> >> >> >> > >> > >> wrote: >>> >> >> >> > >> > >> >>> >>> Update: we're back to 8 blockers on 3.5.0 (not >>> >> clear >>> >> >> >> > >> > >> >>> >>> to me which >>> >> >> >> > >> > >> >>> >>> one(s?) is new?) >>> >> >> >> > >> > >> >>> >>> >>> >> >> >> > >> > >> >>> >>> Looks like the autoconf issue I reported is >>> hitting >>> >> >> >> > >> > >> >>> >>> the upgraded apache jenkins instances as well. >>> I've >>> >> >> >> > >> > >> >>> >>> updated the "archive" list >>> >> >> >> > >> > >> to >>> >> >> >> > >> > >> >>> >>> include the c tests stdout redirect. So while it >>> >> won't >>> >> >> >> > >> > >> >>> >>> go >>> >> >> >> > to >>> >> >> >> > >> > >> console >>> >> >> >> > >> > >> >>> >>> at least we can debug when there is a failure. >>> >> >> >> > >> > >> >>> >>> >>> >> >> >> > >> > >> >>> >>> Raul has been helping Bill with reviews for the >>> >> jetty >>> >> >> >> > server >>> >> >> >> > >> > >> support >>> >> >> >> > >> > >> >>> >>> and it looks like that should be ready soon. 
>>> >> >> >> > >> > >> >>> >>> >>> >> >> >> > >> > >> >>> >>> Raul also requested that someone prioritize >>> >> reviewing >>> >> >> >> > >> > >> "ZOOKEEPER-1919 >>> >> >> >> > >> > >> >>> >>> Update the C implementation of removeWatches to >>> >> have >>> >> >> >> > >> > >> >>> >>> it >>> >> >> >> > >> > match >>> >> >> >> > >> > >> >>> >>> ZOOKEEPER-1910" so that we can include it in >>> 3.5.0. >>> >> >> >> > >> Flavio/Michi? >>> >> >> >> > >> > >> >>> >>> >>> >> >> >> > >> > >> >>> >>> Hongchao got a patch in to cleanup the flakey c >>> >> client >>> >> >> >> > >> > >> >>> >>> reconfig >>> >> >> >> > >> > >> test - >>> >> >> >> > >> > >> >>> >>> kudos on helping cleanup the build/test infra! >>> >> >> >> > >> > >> >>> >>> >>> >> >> >> > >> > >> >>> >>> >>> >> >> >> > >> > >> >>> >>> Based on previous comments it looks like we're >>> >> pretty >>> >> >> >> > close. >>> >> >> >> > >> > >> >>> >>> Do >>> >> >> >> > >> > >> folks >>> >> >> >> > >> > >> >>> >>> feel comfortable with a 3.5.0 alpha at this >>> point? >>> >> >> >> > >> > >> >>> >>> (with a few >>> >> >> >> > >> > >> pending >>> >> >> >> > >> > >> >>> >>> as above) >>> >> >> >> > >> > >> >>> >>> >>> >> >> >> > >> > >> >>> >>> Patrick >>> >> >> >> > >> > >> >>> >>> >>> >> >> >> > >> > >> >>> >>> On Fri, Jul 11, 2014 at 9:24 AM, Raúl Gutiérrez >>> >> >> >> > >> > >> >>> >>> Segalés <r...@itevenworks.net> wrote: >>> >> >> >> > >> > >> >>> >>>> On Jul 11, 2014 6:37 AM, "Flavio Junqueira" >>> >> >> >> > >> > >> >>> <fpjunque...@yahoo.com.invalid> >>> >> >> >> > >> > >> >>> >>>> wrote: >>> >> >> >> > >> > >> >>> >>>>> >>> >> >> >> > >> > >> >>> >>>>> Just so that we don´t delay too much, what if >>> we >>> >> >> >> > >> > >> >>> >>>>> release >>> >> >> >> > an >>> >> >> >> > >> > >> >>> >>>>> alpha >>> >> >> >> > >> > >> >>> version >>> >> >> >> > >> > >> >>> >>>> without 1863 and 1807, and do another one in >>> 2-3 >>> >> >> >> > >> > >> >>> >>>> weeks >>> >> >> >> > time? 
>>> >> >> >> > >> > >> >>> >>>>> >>> >> >> >> > >> > >> >>> >>>> >>> >> >> >> > >> > >> >>> >>>> +1 >>> >> >> >> > >> > >> >>> >>>> >>> >> >> >> > >> > >> >>> >>>> -rgs >>> >> >> >> > >> > >> >>> >>>> >>> >> >> >> > >> > >> >>> >>>>> -Flavio >>> >> >> >> > >> > >> >>> >>>>> >>> >> >> >> > >> > >> >>> >>>>> >>> >> >> >> > >> > >> >>> >>>>> On Thursday, July 3, 2014 6:12 AM, Raúl >>> Gutiérrez >>> >> >> >> > Segalés < >>> >> >> >> > >> > >> >>> >>>> r...@itevenworks.net> wrote: >>> >> >> >> > >> > >> >>> >>>>> >>> >> >> >> > >> > >> >>> >>>>> >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> >>>>>> On 2 July 2014 21:19, Patrick Hunt >>> >> >> >> > >> > >> >>> >>>>>> <ph...@apache.org> >>> >> >> >> > >> > wrote: >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> >>>>>>> Update: we're down to 7 blockers on 5.1.0 >>> >> (from 8 >>> >> >> >> > >> > >> >>> >>>>>>> in >>> >> >> >> > the >>> >> >> >> > >> > >> >>> >>>>>>> last >>> >> >> >> > >> > >> >>> check). >>> >> >> >> > >> > >> >>> >>>>>>> 1810 is waiting on feedback from Michi, and >>> >> >> >> > >> > >> >>> >>>>>>> Camille is >>> >> >> >> > >> > >> threatening >>> >> >> >> > >> > >> >>> to >>> >> >> >> > >> > >> >>> >>>>>>> commit 1863. I see some great progress in >>> >> general >>> >> >> >> > >> > >> >>> >>>>>>> on >>> >> >> >> > the >>> >> >> >> > >> > >> >>> >>>>>>> patch availables queue, which is great to >>> see. >>> >> >> >> > >> > >> >>> >>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>> So here's something else we might consider - >>> >> >> >> > >> > >> >>> >>>>>>> should we drop >>> >> >> >> > >> > >> jdk6 >>> >> >> >> > >> > >> >>> >>>>>>> support from 3.5. It's long since EOL by >>> Oracle >>> >> >> >> > >> > >> >>> >>>>>>> but I suspect >>> >> >> >> > >> > >> some >>> >> >> >> > >> > >> >>> >>>>>>> folks are still using ZK with 6. 
We gotta >>> move >>> >> >> >> > >> > >> >>> >>>>>>> forward though, >>> >> >> >> > >> > >> >>> can't >>> >> >> >> > >> > >> >>> >>>>>>> support it forever. Thoughts? Note that we >>> are >>> >> >> >> > currently >>> >> >> >> > >> > >> >>> >>>>>>> building/testing trunk against jdk6, 7 and >>> 8. >>> >> >> >> > >> > >> >>> >>>>>>> >>> >> >> https://builds.apache.org/view/S-Z/view/ZooKeeper/ >>> >> >> >> > >> > >> >>> >>>>>>> >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> >>>>>> Extra eyes/review for >>> >> >> >> > >> > >> >>> >>>> >>> >> https://issues.apache.org/jira/browse/ZOOKEEPER-1807 >>> >> >> >> > >> > >> >>> >>>>>> would be appreciated (otherwise anyone using >>> >> >> >> > >> > >> >>> >>>>>> Observers with the >>> >> >> >> > >> > >> >>> upcoming >>> >> >> >> > >> > >> >>> >>>>>> alpha release will see there network usage go >>> >> >> >> wild...). >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> >>>>>> -rgs >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> >>>>>>> Patrick >>> >> >> >> > >> > >> >>> >>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>> On Tue, Jul 1, 2014 at 2:26 AM, Flavio >>> >> Junqueira >>> >> >> >> > >> > >> >>> >>>>>>> <fpjunque...@yahoo.com.invalid> wrote: >>> >> >> >> > >> > >> >>> >>>>>>>> According to me, ZK-1810 should be in >>> already, >>> >> >> >> > >> > >> >>> >>>>>>>> but I need a +1 >>> >> >> >> > >> > >> >>> >>>> there. I >>> >> >> >> > >> > >> >>> >>>>>>> think Michi hasn't checked in because LETest >>> >> >> >> > >> > >> >>> >>>>>>> failed in the >>> >> >> >> > >> > >> last QA >>> >> >> >> > >> > >> >>> run >>> >> >> >> > >> > >> >>> >>>>>>> there. 
However, that patch doesn't affect >>> >> LETest, >>> >> >> >> > >> > >> >>> >>>>>>> and >>> >> >> >> > in >>> >> >> >> > >> > >> >>> >>>>>>> fact >>> >> >> >> > >> > >> it >>> >> >> >> > >> > >> >>> fails >>> >> >> >> > >> > >> >>> >>>> in >>> >> >> >> > >> > >> >>> >>>>>>> trunk intermittently, so the test failure >>> >> doesn't >>> >> >> >> > >> > >> >>> >>>>>>> seem >>> >> >> >> > to >>> >> >> >> > >> > >> >>> >>>>>>> be >>> >> >> >> > >> > >> >>> related >>> >> >> >> > >> > >> >>> >>>> to the >>> >> >> >> > >> > >> >>> >>>>>>> patch. >>> >> >> >> > >> > >> >>> >>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>> I haven't checked ZK-1863, so I can't say >>> >> >> >> > >> > >> >>> >>>>>>>> anything concrete >>> >> >> >> > >> > >> about >>> >> >> >> > >> > >> >>> it. >>> >> >> >> > >> > >> >>> >>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>> -Flavio >>> >> >> >> > >> > >> >>> >>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>> On Tuesday, July 1, 2014 5:53 AM, Patrick >>> >> Hunt < >>> >> >> >> > >> > >> ph...@apache.org> >>> >> >> >> > >> > >> >>> >>>> wrote: >>> >> >> >> > >> > >> >>> >>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>> Hi Flavio, do you think those jiras can >>> get >>> >> >> >> > >> > >> reviewed/finalized >>> >> >> >> > >> > >> >>> before >>> >> >> >> > >> > >> >>> >>>>>>>>> the end of the week? I'd like to try >>> cutting >>> >> an >>> >> >> >> > >> > >> >>> >>>>>>>>> RC >>> >> >> >> > >> > soonish... 
>>> >> >> >> > >> > >> >>> >>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>> Patrick >>> >> >> >> > >> > >> >>> >>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>> On Sun, Jun 29, 2014 at 5:02 AM, Flavio >>> >> >> >> > >> > >> >>> >>>>>>>>> Junqueira <fpjunque...@yahoo.com.invalid> >>> >> >> wrote: >>> >> >> >> > >> > >> >>> >>>>>>>>>> +1 for the plan of releasing alpha >>> versions. >>> >> >> >> > >> > >> >>> >>>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>>> I'd like to have ZK-1818 (ZK-1810) and >>> >> ZK-1863 >>> >> >> in. >>> >> >> >> > >> > >> >>> >>>>>>>>>> They are >>> >> >> >> > >> > >> both >>> >> >> >> > >> > >> >>> >>>> patch >>> >> >> >> > >> > >> >>> >>>>>>> available. ZK-1870 is in trunk, but it is >>> still >>> >> >> >> > >> > >> >>> >>>>>>> open because we >>> >> >> >> > >> > >> >>> need a >>> >> >> >> > >> > >> >>> >>>> 3.4 >>> >> >> >> > >> > >> >>> >>>>>>> patch. >>> >> >> >> > >> > >> >>> >>>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>>> -Flavio >>> >> >> >> > >> > >> >>> >>>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>>> On 26 Jun 2014, at 01:07, Patrick Hunt >>> >> >> >> > >> > >> >>> >>>>>>>>>> <ph...@apache.org> >>> >> >> >> > >> > >> >>> wrote: >>> >> >> >> > >> > >> >>> >>>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Hey folks, we've been talking about it >>> for >>> >> a >>> >> >> >> > while, a >>> >> >> >> > >> > >> >>> >>>>>>>>>>> few >>> >> >> >> > >> > >> >>> people >>> >> >> >> > >> > >> >>> >>>> have >>> >> >> >> > >> > >> >>> >>>>>>>>>>> mentioned on the list as well as >>> contacted >>> >> me >>> >> >> >> > >> > >> >>> >>>>>>>>>>> personally >>> >> >> >> > >> > >> that >>> >> >> >> > >> > >> >>> they >>> >> >> >> > >> > >> >>> >>>>>>>>>>> would like to see some progress on the >>> >> first >>> >> >> >> > >> > >> >>> >>>>>>>>>>> 3.5 >>> >> >> >> > >> > release. 
>>> >> >> >> > >> > >> Every >>> >> >> >> > >> > >> >>> >>>>>>>>>>> release is a compromise, if we wait for >>> >> >> >> > >> > >> >>> >>>>>>>>>>> perfection we'll >>> >> >> >> > >> > >> never >>> >> >> >> > >> > >> >>> get >>> >> >> >> > >> > >> >>> >>>>>>>>>>> anything out the door. 3.5 has tons of >>> >> great >>> >> >> >> > >> > >> >>> >>>>>>>>>>> new features, >>> >> >> >> > >> > >> >>> lots of >>> >> >> >> > >> > >> >>> >>>>>>>>>>> hard work, let's get it out in a >>> release so >>> >> >> >> > >> > >> >>> >>>>>>>>>>> that folks can >>> >> >> >> > >> > >> use >>> >> >> >> > >> > >> >>> it, >>> >> >> >> > >> > >> >>> >>>>>>>>>>> test it, and give feedback. >>> >> >> >> > >> > >> >>> >>>>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Jenkins jobs have been pretty stable >>> except >>> >> >> >> > >> > >> >>> >>>>>>>>>>> for the known >>> >> >> >> > >> > >> >>> flakey >>> >> >> >> > >> > >> >>> >>>> test >>> >> >> >> > >> > >> >>> >>>>>>>>>>> ZOOKEEPER-1870 which Flavio committed >>> >> today to >>> >> >> >> > >> > trunk. >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Note >>> >> >> >> > >> > >> that >>> >> >> >> > >> > >> >>> >>>>>>>>>>> jenkins has also been verifying the >>> code on >>> >> >> >> > >> > >> >>> >>>>>>>>>>> jdk7 >>> >> >> >> > and >>> >> >> >> > >> > jdk8. >>> >> >> >> > >> > >> >>> >>>>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Here's my thinking again on how we >>> should >>> >> plan >>> >> >> >> > >> > >> >>> >>>>>>>>>>> our >>> >> >> >> > >> > >> releases: >>> >> >> >> > >> > >> >>> >>>>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>>>> I don't think we'll be able to do a >>> >> >> >> > >> > >> >>> >>>>>>>>>>> 3.5.x-stable >>> >> >> >> > for >>> >> >> >> > >> > >> >>> >>>>>>>>>>> some >>> >> >> >> > >> > >> time. 
>>> >> >> >> > >> > >> >>> >>>> What I >>> >> >> >> > >> > >> >>> >>>>>>>>>>> think we should do instead is similar to >>> >> what >>> >> >> >> > >> > >> >>> >>>>>>>>>>> we >>> >> >> >> > did >>> >> >> >> > >> > >> >>> >>>>>>>>>>> for >>> >> >> >> > >> > >> 3.4. >>> >> >> >> > >> > >> >>> >>>> (this is >>> >> >> >> > >> > >> >>> >>>>>>>>>>> also similar to what Hadoop did during >>> >> their >>> >> >> >> > Hadoop 2 >>> >> >> >> > >> > >> release >>> >> >> >> > >> > >> >>> >>>> cycle) >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Start with a series of alpha releases, >>> >> >> >> > >> > >> >>> >>>>>>>>>>> something people >>> >> >> >> > >> > >> can run >>> >> >> >> > >> > >> >>> >>>> and >>> >> >> >> > >> > >> >>> >>>>>>>>>>> test with, once we address all the >>> blockers >>> >> >> >> > >> > >> >>> >>>>>>>>>>> and >>> >> >> >> > feel >>> >> >> >> > >> > >> >>> comfortable >>> >> >> >> > >> > >> >>> >>>> with >>> >> >> >> > >> > >> >>> >>>>>>>>>>> the apis & remaining jiras we then >>> switch >>> >> to >>> >> >> >> beta. >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Once we >>> >> >> >> > >> > >> get >>> >> >> >> > >> > >> >>> >>>> some >>> >> >> >> > >> > >> >>> >>>>>>>>>>> good feedback we remove the alpha/beta >>> >> moniker >>> >> >> >> > >> > and >>> >> >> >> > >> > >> >>> >>>>>>>>>>> look at >>> >> >> >> > >> > >> >>> making >>> >> >> >> > >> > >> >>> >>>> it >>> >> >> >> > >> > >> >>> >>>>>>>>>>> "stable'. At some later point it will >>> >> become >>> >> >> >> > >> > >> >>> >>>>>>>>>>> the >>> >> >> >> > >> > >> >>> "current/stable" >>> >> >> >> > >> > >> >>> >>>>>>>>>>> release, taking over from 3.4.x. >>> >> >> >> > >> > >> >>> >>>>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>>>> e.g. 
>>> >> >> >> > >> > >> >>> >>>>>>>>>>> 3.5.0-alpha (8 blockers) 3.5.1-alpha (3 >>> >> >> >> > >> > >> >>> >>>>>>>>>>> blockers) 3.5.2-alpha (0 blockers) >>> >> 3.5.3-beta >>> >> >> >> > >> > >> >>> >>>>>>>>>>> (apis locked) 3.5.4-beta 3.5.5-beta >>> >> >> >> > >> > >> >>> >>>>>>>>>>> 3.5.6 (no longer considered alpha/beta >>> but >>> >> >> >> > >> > >> >>> >>>>>>>>>>> also not >>> >> >> >> > >> > >> "stable" vs >>> >> >> >> > >> > >> >>> >>>> 3.4.x, >>> >> >> >> > >> > >> >>> >>>>>>>>>>> maybe use it for production but we still >>> >> >> >> > >> > >> >>> >>>>>>>>>>> expect things to >>> >> >> >> > >> > >> shake >>> >> >> >> > >> > >> >>> >>>> out) >>> >> >> >> > >> > >> >>> >>>>>>>>>>> 3.5.7 >>> >> >> >> > >> > >> >>> >>>>>>>>>>> .... >>> >> >> >> > >> > >> >>> >>>>>>>>>>> 3.5.x - ready to replace 3.4 releases >>> for >>> >> >> >> > production >>> >> >> >> > >> > >> >>> >>>>>>>>>>> use, >>> >> >> >> > >> > >> >>> stable, >>> >> >> >> > >> > >> >>> >>>>>>> etc... >>> >> >> >> > >> > >> >>> >>>>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>>>> There are 8 blockers currently, are any >>> of >>> >> >> >> > >> > >> >>> >>>>>>>>>>> these something >>> >> >> >> > >> > >> that >>> >> >> >> > >> > >> >>> >>>> should >>> >> >> >> > >> > >> >>> >>>>>>>>>>> hold up 3.5.0-alpha? >>> >> >> >> > >> > >> >>> >>>>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>>>> I'll hold open the discussion for a >>> couple >>> >> >> >> > >> > >> >>> >>>>>>>>>>> days. If folks >>> >> >> >> > >> > >> find >>> >> >> >> > >> > >> >>> >>>> this a >>> >> >> >> > >> > >> >>> >>>>>>>>>>> reasonable plan I'll start the ball >>> >> rolling to >>> >> >> >> > >> > >> >>> >>>>>>>>>>> cut >>> >> >> >> > an >>> >> >> >> > >> RC. 
>>> >> >> >> > >> > >> >>> >>>>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Patrick >>> >> >> >> > >> > >> >>> >>>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>>>> >>> >> >> >> > >> > >> >>> >>>>>>> >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> >>>>>> >>> >> >> >> > >> > >> >>> > >>> >> >> >> > >> > >> >>> >>> >> >> >> > >> > >> >>> >> >> >> > >> >>> >> >> >> > >> >>> >> >> >> > >>> >> >> >> >>> >> >> >>> >> >>>
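For reference, here is a minimal sketch of the "wait for the configuration to be updated" idea raised in this thread, combined with an assertion message that carries some detail. It is not the actual ReconfigTest code and not a committed fix. The class ConfigWait, the methods firstRemaining and awaitServersRemoved, and the timeout/poll parameters are all hypothetical names chosen for illustration, and the config is read through a plain Supplier<String> rather than a real ZooKeeper client so that the example runs on its own. Only the "server.".concat(leaving) substring check is taken from the test code quoted in the thread.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

public class ConfigWait {

    /** Returns the first leaving server id still listed in configStr, or null if none are. */
    static String firstRemaining(String configStr, List<String> leavingServers) {
        for (String leaving : leavingServers) {
            // Same substring check as the test code quoted in the thread.
            if (configStr.contains("server.".concat(leaving))) {
                return leaving;
            }
        }
        return null;
    }

    /**
     * Re-reads the config until no leaving server is listed. If the timeout
     * expires first, fails with a message naming the server that is still
     * present and the last config string that was seen.
     */
    static void awaitServersRemoved(Supplier<String> configSource,
                                    List<String> leavingServers,
                                    long timeoutMs, long pollMs)
            throws InterruptedException {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            String config = configSource.get();
            String remaining = firstRemaining(config, leavingServers);
            if (remaining == null) {
                return; // the reconfig is visible to this reader
            }
            if (System.nanoTime() >= deadline) {
                throw new AssertionError("server." + remaining
                        + " still in config after " + timeoutMs
                        + " ms; last config seen: " + config);
            }
            Thread.sleep(pollMs);
        }
    }

    public static void main(String[] args) throws InterruptedException {
        // Hypothetical stand-in for reading the config from a server:
        // server.3 is gone on the third read.
        int[] reads = {0};
        Supplier<String> fakeConfig = () -> ++reads[0] < 3
                ? "server.1=a server.2=b server.3=c"
                : "server.1=a server.2=b";
        awaitServersRemoved(fakeConfig, Arrays.asList("3"), 2000, 10);
        System.out.println("config updated after " + reads[0] + " reads");
    }
}
```

Whether polling is the right fix depends on the question asked earlier in the thread, namely whether the reconfig is supposed to be complete before the restarted servers process new ops. If it is, a wait like this would only hide the underlying problem, though the more detailed failure message would still help when reading Jenkins output.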