Yep, I think what happens is that server 3 is becoming leader rather than server 1, so it's not completing the reconfig. Let me think about how to solve this...
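In the meantime, here is a rough sketch of how the test could poll for the new config instead of asserting once, along the lines Patrick suggests below. It is not ZooKeeper code: `ReconfigWait`, `waitForServersRemoved` and the `Supplier<String>` config source are hypothetical names I made up for illustration, and it assumes Java 8 for `Supplier`. It would only address the timing of the check, not server 3 winning the election.

```java
import java.util.List;
import java.util.function.Supplier;

// Hypothetical helper for illustration only; not part of ZooKeeper's ReconfigTest.
final class ReconfigWait {

    private ReconfigWait() {}

    /**
     * Polls configSource until none of the leaving servers appear in the
     * returned config string, or until timeoutMs has elapsed.
     * Returns true if the servers were gone before the deadline, false on
     * timeout or if the waiting thread is interrupted.
     */
    static boolean waitForServersRemoved(Supplier<String> configSource,
                                         List<String> leavingServers,
                                         long timeoutMs,
                                         long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (true) {
            String configStr = configSource.get();
            if (configStr != null && noneListed(configStr, leavingServers)) {
                return true;
            }
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and give up waiting.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }

    // Same substring check that the existing assertion in the test uses.
    private static boolean noneListed(String configStr, List<String> leavingServers) {
        for (String leaving : leavingServers) {
            if (configStr.contains("server.".concat(leaving))) {
                return false;
            }
        }
        return true;
    }
}
```

In the test, the supplier would wrap whatever call reads the current config from the server under test, and the single assertFalse would become an assertTrue on the returned boolean with a timeout generous enough for a slow build machine.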
On Tue, Jul 22, 2014 at 12:21 PM, Patrick Hunt <ph...@apache.org> wrote:
> Also if you want to submit a patch that provides more insight (logs)
> for that operation/test lmk and I'll be happy to review/commit it.
> Should help with debugging the issue and debugging in the field.
>
> Thanks!
>
> Patrick
>
> On Tue, Jul 22, 2014 at 12:17 PM, Patrick Hunt <ph...@apache.org> wrote:
> > Here's the logs (attached) for the test that failed. Nothing stuck out
> > at me - anything ring a bell?
> >
> > Patrick
> >
> > On Tue, Jul 22, 2014 at 12:10 PM, Alexander Shraer <shra...@gmail.com> wrote:
> >> Unfortunately doesn't look like we have enough logging going on there.
> >> For example would be nice to know what's the committed config and last seen config
> >> of the leader when it comes up (leader.lead()). and what configuration is
> >> sent in the NEWLEADER message
> >> sent out in LeaderHandler:
> >>
> >>     QuorumPacket newLeaderQP = new
> >>         QuorumPacket(Leader.NEWLEADER,
> >>             newLeaderZxid,
> >>             leader.self.getLastSeenQuorumVerifier()
> >>                 .toString().getBytes(), null);
> >>
> >> I didn't know about the option to have a separate administrative interface,
> >> and just followed the flow of other commands... I agree that it would be
> >> cleaner.
> >>
> >> On Tue, Jul 22, 2014 at 11:36 AM, Patrick Hunt <ph...@apache.org> wrote:
> >>
> >>> On Tue, Jul 22, 2014 at 11:29 AM, Alexander Shraer <shra...@gmail.com> wrote:
> >>> > Hmm. It doesn't really make sense to me - the reconfig should be completed
> >>> > before
> >>> > the servers come up and process new ops. We submitted the reconfig to
> >>> > server 1, it timed out
> >>> > on new quorum, but when 1 becomes leader again after 2 restarts 1 should
> >>> > complete the reconfig.
> >>> > is 1 becoming leader after 2 restarts ?
> >>> >
> >>>
> >>> What should I look for in the logs? Any specific log messages that
> >>> would help debug?
> >>>
> >>> > About admin controls - reconfig/getConfig are open to everyone, unless you
> >>> > set permissions on the configuration znode being written during reconfig.
> >>> >     nodeRecord = getRecordForPath(ZooDefs.CONFIG_NODE);
> >>> >
> >>> >     checkACL(zks, nodeRecord.acl, ZooDefs.Perms.WRITE,
> >>> >         request.authInfo);
> >>> >
> >>>
> >>> So I can turn off all access then? (read and write). Should we ship
> >>> that as the default? We should add that to the docs.
> >>>
> >>> In the past we've always tried to hide this type of information from
> >>> clients (e.g. we don't expose the zk server address to the client for
> >>> a session). This seems like a very big departure. Why didn't we move
> >>> it to a separate, administrative, interface?
> >>>
> >>> Patrick
> >>>
> >>> >
> >>> > On Tue, Jul 22, 2014 at 11:16 AM, Patrick Hunt <phu...@gmail.com> wrote:
> >>> >
> >>> >> Looks like 3 hasn't been removed (unfortunately the assertion doesn't
> >>> >> include any msg detail, but that's the way it looks to me like the
> >>> >> test is setup):
> >>> >>
> >>> >>     if (leavingServers != null) {
> >>> >>         for (String leaving : leavingServers)
> >>> >>             Assert.assertFalse(configStr.contains("server.".concat(leaving)));
> >>> >>     }
> >>> >>
> >>> >> which is called from:
> >>> >>
> >>> >>     qu.restart(2);
> >>> >>     // Now that 2 is back up, they'll complete the reconfig removing 3 and
> >>> >>     // can process other ops.
> >>> >>     testServerHasConfig(zkArr[1], null, leavingServers);
> >>> >>
> >>> >> It seems like the problem is that testServerHasConfig is not waiting
> >>> >> for the configuration to be updated? In this case 2 was just restarted
> >>> >> and 3 hasn't had a chance to be removed? (on a slower machine say,
> >>> >> which might be why you aren't seeing the issue? hence the flakeyness)
> >>> >>
> >>> >> Patrick
> >>> >>
> >>> >> On Tue, Jul 22, 2014 at 10:57 AM, Alexander Shraer <shra...@gmail.com> wrote:
> >>> >> > Hi Patrick, I'm not sure why you're seeing this - it consistently passes on
> >>> >> > my machine. In case you'd like to take a look, the test has tons of
> >>> >> > comments explaining the scenario. Let me know how I can help.
> >>> >> >
> >>> >> > On Tue, Jul 22, 2014 at 9:53 AM, Patrick Hunt <ph...@apache.org> wrote:
> >>> >> >
> >>> >> >> Hi Alex, I've also seen the test "testLeaderTimesoutOnNewQuorum" fail
> >>> >> >> multiple times (not every time, but ~50%, so flakey) in the last few
> >>> >> >> days. It's failing both on jdk6 and jdk7. (this is my personal
> >>> >> >> jenkins, I haven't see any other failures than this during the past
> >>> >> >> few days).
> >>> >> >>
> >>> >> >> junit.framework.AssertionFailedError
> >>> >> >>     at org.apache.zookeeper.test.ReconfigTest.testServerHasConfig(ReconfigTest.java:127)
> >>> >> >>     at org.apache.zookeeper.test.ReconfigTest.testLeaderTimesoutOnNewQuorum(ReconfigTest.java:450)
> >>> >> >>     at org.apache.zookeeper.JUnit4ZKTestRunner$LoggedInvokeMethod.evaluate(JUnit4ZKTestRunner.java:52)
> >>> >> >>
> >>> >> >> Patrick
> >>> >> >>
> >>> >> >> On Tue, Jul 22, 2014 at 8:37 AM, Alexander Shraer <shra...@gmail.com> wrote:
> >>> >> >> > Hi Rakesh,
> >>> >> >> >
> >>> >> >> > Thanks for looking at this. In general even if we find the bug since we
> >>> >> >> > should test it before committing a fix, it seems better to remove the test
> >>> >> >> > for now and debug this on a build machine. I'm trying to get access to it.
> >>> >> >> >
> >>> >> >> > Looking at this log:
> >>> >> >> > https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeeper-trunk/2380/testReport/org.apache.zookeeper.server.quorum/ReconfigRecoveryTest/testCurrentObserverIsParticipantInNewConfig/
> >>> >> >> >
> >>> >> >> > Something weird is going on. Sever 3 hasn't started yet, but version 200000000
> >>> >> >> > is already being sent around as committed!
> >>> >> >> >
> >>> >> >> > 2014-07-21 10:44:50,901 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection$Messenger$WorkerReceiver@293] - 2 Received version: 200000000 my version: 0
> >>> >> >> >
> >>> >> >> > and also in leader election messages.
> >>> >> >> >
> >>> >> >> > Also weird is that the version of 2 is 0 as if it is a joiner, whereas we
> >>> >> >> > explicitly started it with 100000000.
> >>> >> >> > Then it makes sense that the new config can't be committed since its
> >>> >> >> > version is not high enough...
> >>> >> >> >
> >>> >> >> > I wonder if its possible that not all servers from the previous test are
> >>> >> >> > dead and they are interfering...
> >>> >> >> >
> >>> >> >> > On Tue, Jul 22, 2014 at 3:53 AM, Rakesh R <rake...@huawei.com> wrote:
> >>> >> >> >
> >>> >> >> >> Hi Alex,
> >>> >> >> >>
> >>> >> >> >> Yeah it is consistently passing in my machine also.
> >>> >> >> >>
> >>> >> >> >> I have quickly gone through the
> >>> >> >> >> testCurrentObserverIsParticipantInNewConfig failure logs in
> >>> >> >> >> PreCommit-ZOOKEEPER-Build. It looks like 200000000 (n.config version) has
> >>> >> >> >> not taken and still leader election is seeing 100000000 (n.config version).
> >>> >> >> >> Unfortunately I didn't find the reason for not considering the updated
> >>> >> >> >> config version.
> >>> >> >> >>
> >>> >> >> >> Reference:
> >>> >> >> >> https://builds.apache.org/job/PreCommit-ZOOKEEPER-Build/2213/testReport/junit/org.apache.zookeeper.server.quorum/ReconfigRecoveryTest/testCurrentObserverIsParticipantInNewConfig
> >>> >> >> >>
> >>> >> >> >> 2014-07-22 06:38:00,330 [myid:1] - INFO [QuorumPeer[myid=1]/127.0.0.1:11298:FastLeaderElection@922] - Notification time out: 51200
> >>> >> >> >> 2014-07-22 06:38:00,330 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@682] - Notification: 2 (message format version), 1 (n.leader), 0x100000005 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x1 (n.peerEPoch), LOOKING (my state)100000000 (n.config version)
> >>> >> >> >> 2014-07-22 06:38:00,331 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@682] - Notification: 2 (message format version), 1 (n.leader), 0x100000005 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x1 (n.peerEPoch), LOOKING (my state)100000000 (n.config version)
> >>> >> >> >> 2014-07-22 06:38:00,330 [myid:2] - INFO [QuorumPeer[myid=2]/127.0.0.1:11301:FastLeaderElection@922] - Notification time out: 51200
> >>> >> >> >> 2014-07-22 06:38:00,331 [myid:0] - INFO [WorkerReceiver[myid=0]:FastLeaderElection@682] - Notification: 2 (message format version), 1 (n.leader), 0x100000005 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x1 (n.peerEPoch), LOOKING (my state)100000000 (n.config version)
> >>> >> >> >> 2014-07-22 06:38:00,331 [myid:2] - INFO [WorkerReceiver[myid=2]:FastLeaderElection@682] - Notification: 2 (message format version), 1 (n.leader), 0x100000005 (n.zxid), 0x1 (n.round), LOOKING (n.state), 1 (n.sid), 0x1 (n.peerEPoch), LOOKING (my state)100000000 (n.config version)
> >>> >> >> >>
> >>> >> >> >> 2014-07-22 06:38:00,332 [myid:0] - INFO [WorkerReceiver[myid=0]:FastLeaderElection@682] - Notification: 2 (message format version), 1 (n.leader), 0x100000005 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x1 (n.peerEPoch), LOOKING (my state)100000000 (n.config version)
> >>> >> >> >> 2014-07-22 06:38:00,332 [myid:1] - INFO [WorkerReceiver[myid=1]:FastLeaderElection@682] - Notification: 2 (message format version), 1 (n.leader), 0x100000005 (n.zxid), 0x1 (n.round), LOOKING (n.state), 2 (n.sid), 0x1 (n.peerEPoch), LOOKING (my state)100000000 (n.config version)
> >>> >> >> >>
> >>> >> >> >> -Rakesh
> >>> >> >> >>
> >>> >> >> >> -----Original Message-----
> >>> >> >> >> From: Alexander Shraer [mailto:shra...@gmail.com]
> >>> >> >> >> Sent: 22 July 2014 11:57
> >>> >> >> >> To: dev@zookeeper.apache.org
> >>> >> >> >> Subject: Re: ZooKeeper 3.5.0-alpha planning
> >>> >> >> >>
> >>> >> >> >> I tried to look into it, but the test consistently passes locally on two
> >>> >> >> >> machines.
> >>> >> >> >> I don't currently have access to the build machine, but I can try to ask
> >>> >> >> >> for access.
> >>> >> >> >> Unless anyone has a better suggestion, we could remove the failing test in
> >>> >> >> >> the meanwhile and open a JIRA to add it back...
> >>> >> >> >> > >>> >> >> >> > >>> >> >> >> On Mon, Jul 21, 2014 at 10:09 PM, Patrick Hunt < > ph...@apache.org> > >>> >> >> wrote: > >>> >> >> >> > >>> >> >> >> > I'm seeing alot of test failures in > >>> >> >> >> > testCurrentObserverIsParticipantInNewConfig could someone > take a > >>> >> look? > >>> >> >> >> > Seems related to ZOOKEEPER-1807 recent commit. > >>> >> >> >> > > >>> >> >> >> > > >>> >> >> >> > > >>> >> >> > >>> https://issues.apache.org/jira/browse/ZOOKEEPER-1807?focusedCommentId= > >>> >> >> >> > > >>> >> > 14069024&page=com.atlassian.jira.plugin.system.issuetabpanels:comment- > >>> >> >> >> > tabpanel#comment-14069024 > >>> >> >> >> > > >>> >> >> >> > Patrick > >>> >> >> >> > > >>> >> >> >> > On Mon, Jul 21, 2014 at 11:12 AM, Rakesh Radhakrishnan > >>> >> >> >> > <rakeshr.apa...@gmail.com> wrote: > >>> >> >> >> > > lgtm +1 > >>> >> >> >> > > > >>> >> >> >> > > > >>> >> >> >> > > On Mon, Jul 21, 2014 at 11:37 PM, FPJ > >>> >> >> >> > > <fpjunque...@yahoo.com.invalid> > >>> >> >> >> > wrote: > >>> >> >> >> > > > >>> >> >> >> > >> +1 for having an RC this week. Since this is an alpha > >>> release, I > >>> >> >> >> > >> +think > >>> >> >> >> > 72 > >>> >> >> >> > >> biz hours is enough for the vote. > >>> >> >> >> > >> > >>> >> >> >> > >> -Flavio > >>> >> >> >> > >> > >>> >> >> >> > >> > -----Original Message----- > >>> >> >> >> > >> > From: Patrick Hunt [mailto:ph...@apache.org] > >>> >> >> >> > >> > Sent: 21 July 2014 18:55 > >>> >> >> >> > >> > To: DevZooKeeper > >>> >> >> >> > >> > Subject: Re: ZooKeeper 3.5.0-alpha planning > >>> >> >> >> > >> > > >>> >> >> >> > >> > I fixed a number of issues. I also started a few > threads > >>> with > >>> >> >> >> > >> > builds@ > >>> >> >> >> > >> > - the ulimit issue is still outstanding. Hongchao and I > >>> worked > >>> >> >> >> > through a > >>> >> >> >> > >> > number of findbugs issues, it's not closed yet but it's > >>> pretty > >>> >> >> >> close. 
> >>> >> >> >> > >> > > >>> >> >> >> > >> > I don't see why we can't create an RC and start voting > this > >>> >> week > >>> >> >> >> > though. > >>> >> >> >> > >> > Anyone disagree? > >>> >> >> >> > >> > > >>> >> >> >> > >> > How long should we let the vote run, the std 72 biz > hours > >>> or > >>> >> >> >> > >> > should we > >>> >> >> >> > >> plan > >>> >> >> >> > >> > for more to allow folks more time to test? > >>> >> >> >> > >> > > >>> >> >> >> > >> > Patrick > >>> >> >> >> > >> > > >>> >> >> >> > >> > On Mon, Jul 21, 2014 at 10:29 AM, Raúl Gutiérrez > Segalés > >>> >> >> >> > >> > <r...@itevenworks.net> wrote: > >>> >> >> >> > >> > > On 18 July 2014 10:32, Patrick Hunt < > ph...@apache.org> > >>> >> wrote: > >>> >> >> >> > >> > > > >>> >> >> >> > >> > >> You may notice some back/forth on Apache Jenkins ZK > >>> jobs - > >>> >> I'm > >>> >> >> >> > trying > >>> >> >> >> > >> > >> to fix some of the jobs that were broken during the > >>> recent > >>> >> >> >> > >> > >> host upgrade. > >>> >> >> >> > >> > >> > >>> >> >> >> > >> > > > >>> >> >> >> > >> > > How are things looking? Is it likely that we can > have a > >>> >> 3.5.0 > >>> >> >> >> > >> > > alpha release week or are we still blocked on > Jenkins? > >>> >> >> >> > >> > > > >>> >> >> >> > >> > > > >>> >> >> >> > >> > > -rgs > >>> >> >> >> > >> > > > >>> >> >> >> > >> > > > >>> >> >> >> > >> > > > >>> >> >> >> > >> > > > >>> >> >> >> > >> > > > >>> >> >> >> > >> > > > >>> >> >> >> > >> > >> Patrick > >>> >> >> >> > >> > >> > >>> >> >> >> > >> > >> On Thu, Jul 17, 2014 at 1:47 PM, Michi Mutsuzaki > >>> >> >> >> > >> > >> <mi...@cs.stanford.edu> > >>> >> >> >> > >> > >> wrote: > >>> >> >> >> > >> > >> > I'll check in ZOOKEEPER-1683. > >>> >> >> >> > >> > >> > > >>> >> >> >> > >> > >> > On Thu, Jul 17, 2014 at 11:20 AM, Alexander Shraer > >>> >> >> >> > >> > >> > <shra...@gmail.com> > >>> >> >> >> > >> > >> wrote: > >>> >> >> >> > >> > >> >> can we also have ZOOKEEPER-1683 in ? 
Camille > gave a > >>> +1 > >>> >> and > >>> >> >> >> > >> > >> >> all > >>> >> >> >> > >> > >> subsequent > >>> >> >> >> > >> > >> >> changes were formatting as suggested by Rakesh. > >>> >> >> >> > >> > >> >> > >>> >> >> >> > >> > >> >> > >>> >> >> >> > >> > >> >> On Thu, Jul 17, 2014 at 9:48 AM, Patrick Hunt > >>> >> >> >> > >> > >> >> <ph...@apache.org > >>> >> >> >> > > > >>> >> >> >> > >> > wrote: > >>> >> >> >> > >> > >> >> > >>> >> >> >> > >> > >> >>> I'm concerned that the CI tests are all failing > due > >>> to, > >>> >> >> >> > >> > >> >>> for > >>> >> >> >> > e.g. > >>> >> >> >> > >> > >> >>> findbugs issues. At the very least our > build/test/ci > >>> >> >> >> > >> > >> >>> should be pretty clean - some flakeys is ok (the > >>> recent > >>> >> >> >> > >> > >> >>> startServer fix > >>> >> >> >> > and > >>> >> >> >> > >> > >> >>> some other flakeys that have been addressed go a > >>> long > >>> >> way > >>> >> >> >> > >> > >> >>> on > >>> >> >> >> > that > >>> >> >> >> > >> > >> >>> issue) but I think the findbugs problem should > be > >>> >> cleaned > >>> >> >> >> > >> > >> >>> up before we cut a release. I started a separate > >>> >> thread to > >>> >> >> >> > >> > >> >>> discuss > >>> >> >> >> > >> the > >>> >> >> >> > >> > findbugs issue. > >>> >> >> >> > >> > >> >>> > >>> >> >> >> > >> > >> >>> Otw we seem to be in ok shape - 1863 is in. > >>> >> >> >> > >> > >> >>> > >>> >> >> >> > >> > >> >>> Anyone have a chance to give feedback to Raul on > >>> 1919? > >>> >> >> >> > >> > >> >>> > >>> >> >> >> > >> > >> >>> Patrick > >>> >> >> >> > >> > >> >>> > >>> >> >> >> > >> > >> >>> On Tue, Jul 15, 2014 at 10:34 AM, Flavio > Junqueira > >>> >> >> >> > >> > >> >>> <fpjunque...@yahoo.com.invalid> wrote: > >>> >> >> >> > >> > >> >>> > My take: > >>> >> >> >> > >> > >> >>> > > >>> >> >> >> > >> > >> >>> > - ZK-1863 is pending review. It is a blocker > and > >>> it > >>> >> can > >>> >> >> >> > >> > >> >>> > go > >>> >> >> >> > in. 
> >>> >> >> >> > >> > >> >>> > See > >>> >> >> >> > >> > >> the > >>> >> >> >> > >> > >> >>> jira for comments. > >>> >> >> >> > >> > >> >>> > - We can try to have ZK-1807 in for the first > >>> alpha. > >>> >> >> >> > >> > >> >>> > - I'd rather not have the first alpha > depending on > >>> >> >> >> > >> > >> >>> > ZK-1919 > >>> >> >> >> > and > >>> >> >> >> > >> > >> ZK-1910, > >>> >> >> >> > >> > >> >>> we can leave it for the second alpha. > >>> >> >> >> > >> > >> >>> > > >>> >> >> >> > >> > >> >>> > If you agree with this, then we should be > able to > >>> >> cut a > >>> >> >> >> > >> > >> >>> > candidate by > >>> >> >> >> > >> > >> the > >>> >> >> >> > >> > >> >>> end of this week. > >>> >> >> >> > >> > >> >>> > > >>> >> >> >> > >> > >> >>> > -Flavio > >>> >> >> >> > >> > >> >>> > > >>> >> >> >> > >> > >> >>> > On 15 Jul 2014, at 17:26, Patrick Hunt > >>> >> >> >> > >> > >> >>> > <ph...@apache.org> > >>> >> >> >> > >> wrote: > >>> >> >> >> > >> > >> >>> > > >>> >> >> >> > >> > >> >>> >> Per my previous note you can now see the c > client > >>> >> test > >>> >> >> >> > >> > >> >>> >> log output > >>> >> >> >> > >> > >> here > >>> >> >> >> > >> > >> >>> >> in the "build artifacts" section: > >>> >> >> >> > >> > >> >>> >> > >>> >> >> >> > >> > >> >>> > >>> >> >> >> > >> > >> > >>> >> >> https://builds.apache.org/view/S-Z/view/ZooKeeper/job/ZooKeepe > >>> >> >> >> > >> > >> r- > >>> >> >> >> > >> > trunk > >>> >> >> >> > >> > >> /2372/ > >>> >> >> >> > >> > >> >>> >> > >>> >> >> >> > >> > >> >>> >> Patrick > >>> >> >> >> > >> > >> >>> >> > >>> >> >> >> > >> > >> >>> >> On Mon, Jul 14, 2014 at 7:36 PM, Patrick Hunt > >>> >> >> >> > >> > >> >>> >> <ph...@apache.org> > >>> >> >> >> > >> > >> wrote: > >>> >> >> >> > >> > >> >>> >>> Update: we're back to 8 blockers on 3.5.0 > (not > >>> >> clear > >>> >> >> >> > >> > >> >>> >>> to me which > >>> >> >> >> > >> > >> >>> >>> one(s?) is new?) 
> >>> >> >> >> > >> > >> >>> >>> > >>> >> >> >> > >> > >> >>> >>> Looks like the autoconf issue I reported is > >>> hitting > >>> >> >> >> > >> > >> >>> >>> the upgraded apache jenkins instances as > well. > >>> I've > >>> >> >> >> > >> > >> >>> >>> updated the "archive" list > >>> >> >> >> > >> > >> to > >>> >> >> >> > >> > >> >>> >>> include the c tests stdout redirect. So > while it > >>> >> won't > >>> >> >> >> > >> > >> >>> >>> go > >>> >> >> >> > to > >>> >> >> >> > >> > >> console > >>> >> >> >> > >> > >> >>> >>> at least we can debug when there is a > failure. > >>> >> >> >> > >> > >> >>> >>> > >>> >> >> >> > >> > >> >>> >>> Raul has been helping Bill with reviews for > the > >>> >> jetty > >>> >> >> >> > server > >>> >> >> >> > >> > >> support > >>> >> >> >> > >> > >> >>> >>> and it looks like that should be ready soon. > >>> >> >> >> > >> > >> >>> >>> > >>> >> >> >> > >> > >> >>> >>> Raul also requested that someone prioritize > >>> >> reviewing > >>> >> >> >> > >> > >> "ZOOKEEPER-1919 > >>> >> >> >> > >> > >> >>> >>> Update the C implementation of > removeWatches to > >>> >> have > >>> >> >> >> > >> > >> >>> >>> it > >>> >> >> >> > >> > match > >>> >> >> >> > >> > >> >>> >>> ZOOKEEPER-1910" so that we can include it in > >>> 3.5.0. > >>> >> >> >> > >> Flavio/Michi? > >>> >> >> >> > >> > >> >>> >>> > >>> >> >> >> > >> > >> >>> >>> Hongchao got a patch in to cleanup the > flakey c > >>> >> client > >>> >> >> >> > >> > >> >>> >>> reconfig > >>> >> >> >> > >> > >> test - > >>> >> >> >> > >> > >> >>> >>> kudos on helping cleanup the build/test > infra! > >>> >> >> >> > >> > >> >>> >>> > >>> >> >> >> > >> > >> >>> >>> > >>> >> >> >> > >> > >> >>> >>> Based on previous comments it looks like > we're > >>> >> pretty > >>> >> >> >> > close. > >>> >> >> >> > >> > >> >>> >>> Do > >>> >> >> >> > >> > >> folks > >>> >> >> >> > >> > >> >>> >>> feel comfortable with a 3.5.0 alpha at this > >>> point? 
> >>> >> >> >> > >> > >> >>> >>> (with a few > >>> >> >> >> > >> > >> pending > >>> >> >> >> > >> > >> >>> >>> as above) > >>> >> >> >> > >> > >> >>> >>> > >>> >> >> >> > >> > >> >>> >>> Patrick > >>> >> >> >> > >> > >> >>> >>> > >>> >> >> >> > >> > >> >>> >>> On Fri, Jul 11, 2014 at 9:24 AM, Raúl > Gutiérrez > >>> >> >> >> > >> > >> >>> >>> Segalés <r...@itevenworks.net> wrote: > >>> >> >> >> > >> > >> >>> >>>> On Jul 11, 2014 6:37 AM, "Flavio Junqueira" > >>> >> >> >> > >> > >> >>> <fpjunque...@yahoo.com.invalid> > >>> >> >> >> > >> > >> >>> >>>> wrote: > >>> >> >> >> > >> > >> >>> >>>>> > >>> >> >> >> > >> > >> >>> >>>>> Just so that we don´t delay too much, > what if > >>> we > >>> >> >> >> > >> > >> >>> >>>>> release > >>> >> >> >> > an > >>> >> >> >> > >> > >> >>> >>>>> alpha > >>> >> >> >> > >> > >> >>> version > >>> >> >> >> > >> > >> >>> >>>> without 1863 and 1807, and do another one > in > >>> 2-3 > >>> >> >> >> > >> > >> >>> >>>> weeks > >>> >> >> >> > time? > >>> >> >> >> > >> > >> >>> >>>>> > >>> >> >> >> > >> > >> >>> >>>> > >>> >> >> >> > >> > >> >>> >>>> +1 > >>> >> >> >> > >> > >> >>> >>>> > >>> >> >> >> > >> > >> >>> >>>> -rgs > >>> >> >> >> > >> > >> >>> >>>> > >>> >> >> >> > >> > >> >>> >>>>> -Flavio > >>> >> >> >> > >> > >> >>> >>>>> > >>> >> >> >> > >> > >> >>> >>>>> > >>> >> >> >> > >> > >> >>> >>>>> On Thursday, July 3, 2014 6:12 AM, Raúl > >>> Gutiérrez > >>> >> >> >> > Segalés < > >>> >> >> >> > >> > >> >>> >>>> r...@itevenworks.net> wrote: > >>> >> >> >> > >> > >> >>> >>>>> > >>> >> >> >> > >> > >> >>> >>>>> > >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> >>>>>> On 2 July 2014 21:19, Patrick Hunt > >>> >> >> >> > >> > >> >>> >>>>>> <ph...@apache.org> > >>> >> >> >> > >> > wrote: > >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>> Update: we're down to 7 blockers on > 5.1.0 > >>> >> (from 8 > >>> >> >> >> > >> > >> >>> >>>>>>> in > >>> >> >> >> > the > >>> >> >> >> > 
>> > >> >>> >>>>>>> last > >>> >> >> >> > >> > >> >>> check). > >>> >> >> >> > >> > >> >>> >>>>>>> 1810 is waiting on feedback from Michi, > and > >>> >> >> >> > >> > >> >>> >>>>>>> Camille is > >>> >> >> >> > >> > >> threatening > >>> >> >> >> > >> > >> >>> to > >>> >> >> >> > >> > >> >>> >>>>>>> commit 1863. I see some great progress > in > >>> >> general > >>> >> >> >> > >> > >> >>> >>>>>>> on > >>> >> >> >> > the > >>> >> >> >> > >> > >> >>> >>>>>>> patch availables queue, which is great > to > >>> see. > >>> >> >> >> > >> > >> >>> >>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>> So here's something else we might > consider - > >>> >> >> >> > >> > >> >>> >>>>>>> should we drop > >>> >> >> >> > >> > >> jdk6 > >>> >> >> >> > >> > >> >>> >>>>>>> support from 3.5. It's long since EOL by > >>> Oracle > >>> >> >> >> > >> > >> >>> >>>>>>> but I suspect > >>> >> >> >> > >> > >> some > >>> >> >> >> > >> > >> >>> >>>>>>> folks are still using ZK with 6. We > gotta > >>> move > >>> >> >> >> > >> > >> >>> >>>>>>> forward though, > >>> >> >> >> > >> > >> >>> can't > >>> >> >> >> > >> > >> >>> >>>>>>> support it forever. Thoughts? Note that > we > >>> are > >>> >> >> >> > currently > >>> >> >> >> > >> > >> >>> >>>>>>> building/testing trunk against jdk6, 7 > and > >>> 8. > >>> >> >> >> > >> > >> >>> >>>>>>> > >>> >> >> https://builds.apache.org/view/S-Z/view/ZooKeeper/ > >>> >> >> >> > >> > >> >>> >>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> >>>>>> Extra eyes/review for > >>> >> >> >> > >> > >> >>> >>>> > >>> >> https://issues.apache.org/jira/browse/ZOOKEEPER-1807 > >>> >> >> >> > >> > >> >>> >>>>>> would be appreciated (otherwise anyone > using > >>> >> >> >> > >> > >> >>> >>>>>> Observers with the > >>> >> >> >> > >> > >> >>> upcoming > >>> >> >> >> > >> > >> >>> >>>>>> alpha release will see there network > usage go > >>> >> >> >> wild...). 
> >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> >>>>>> -rgs > >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>> Patrick > >>> >> >> >> > >> > >> >>> >>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>> On Tue, Jul 1, 2014 at 2:26 AM, Flavio > >>> >> Junqueira > >>> >> >> >> > >> > >> >>> >>>>>>> <fpjunque...@yahoo.com.invalid> wrote: > >>> >> >> >> > >> > >> >>> >>>>>>>> According to me, ZK-1810 should be in > >>> already, > >>> >> >> >> > >> > >> >>> >>>>>>>> but I need a +1 > >>> >> >> >> > >> > >> >>> >>>> there. I > >>> >> >> >> > >> > >> >>> >>>>>>> think Michi hasn't checked in because > LETest > >>> >> >> >> > >> > >> >>> >>>>>>> failed in the > >>> >> >> >> > >> > >> last QA > >>> >> >> >> > >> > >> >>> run > >>> >> >> >> > >> > >> >>> >>>>>>> there. However, that patch doesn't > affect > >>> >> LETest, > >>> >> >> >> > >> > >> >>> >>>>>>> and > >>> >> >> >> > in > >>> >> >> >> > >> > >> >>> >>>>>>> fact > >>> >> >> >> > >> > >> it > >>> >> >> >> > >> > >> >>> fails > >>> >> >> >> > >> > >> >>> >>>> in > >>> >> >> >> > >> > >> >>> >>>>>>> trunk intermittently, so the test > failure > >>> >> doesn't > >>> >> >> >> > >> > >> >>> >>>>>>> seem > >>> >> >> >> > to > >>> >> >> >> > >> > >> >>> >>>>>>> be > >>> >> >> >> > >> > >> >>> related > >>> >> >> >> > >> > >> >>> >>>> to the > >>> >> >> >> > >> > >> >>> >>>>>>> patch. > >>> >> >> >> > >> > >> >>> >>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>> I haven't checked ZK-1863, so I can't > say > >>> >> >> >> > >> > >> >>> >>>>>>>> anything concrete > >>> >> >> >> > >> > >> about > >>> >> >> >> > >> > >> >>> it. 
> >>> >> >> >> > >> > >> >>> >>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>> -Flavio > >>> >> >> >> > >> > >> >>> >>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>> On Tuesday, July 1, 2014 5:53 AM, > Patrick > >>> >> Hunt < > >>> >> >> >> > >> > >> ph...@apache.org> > >>> >> >> >> > >> > >> >>> >>>> wrote: > >>> >> >> >> > >> > >> >>> >>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>> Hi Flavio, do you think those jiras > can > >>> get > >>> >> >> >> > >> > >> reviewed/finalized > >>> >> >> >> > >> > >> >>> before > >>> >> >> >> > >> > >> >>> >>>>>>>>> the end of the week? I'd like to try > >>> cutting > >>> >> an > >>> >> >> >> > >> > >> >>> >>>>>>>>> RC > >>> >> >> >> > >> > soonish... > >>> >> >> >> > >> > >> >>> >>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>> Patrick > >>> >> >> >> > >> > >> >>> >>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>> On Sun, Jun 29, 2014 at 5:02 AM, > Flavio > >>> >> >> >> > >> > >> >>> >>>>>>>>> Junqueira > <fpjunque...@yahoo.com.invalid> > >>> >> >> wrote: > >>> >> >> >> > >> > >> >>> >>>>>>>>>> +1 for the plan of releasing alpha > >>> versions. > >>> >> >> >> > >> > >> >>> >>>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>>> I'd like to have ZK-1818 (ZK-1810) > and > >>> >> ZK-1863 > >>> >> >> in. > >>> >> >> >> > >> > >> >>> >>>>>>>>>> They are > >>> >> >> >> > >> > >> both > >>> >> >> >> > >> > >> >>> >>>> patch > >>> >> >> >> > >> > >> >>> >>>>>>> available. ZK-1870 is in trunk, but it > is > >>> still > >>> >> >> >> > >> > >> >>> >>>>>>> open because we > >>> >> >> >> > >> > >> >>> need a > >>> >> >> >> > >> > >> >>> >>>> 3.4 > >>> >> >> >> > >> > >> >>> >>>>>>> patch. 
> >>> >> >> >> > >> > >> >>> >>>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>>> -Flavio > >>> >> >> >> > >> > >> >>> >>>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>>> On 26 Jun 2014, at 01:07, Patrick > Hunt > >>> >> >> >> > >> > >> >>> >>>>>>>>>> <ph...@apache.org> > >>> >> >> >> > >> > >> >>> wrote: > >>> >> >> >> > >> > >> >>> >>>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Hey folks, we've been talking about > it > >>> for > >>> >> a > >>> >> >> >> > while, a > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> few > >>> >> >> >> > >> > >> >>> people > >>> >> >> >> > >> > >> >>> >>>> have > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> mentioned on the list as well as > >>> contacted > >>> >> me > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> personally > >>> >> >> >> > >> > >> that > >>> >> >> >> > >> > >> >>> they > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> would like to see some progress on > the > >>> >> first > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> 3.5 > >>> >> >> >> > >> > release. > >>> >> >> >> > >> > >> Every > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> release is a compromise, if we wait > for > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> perfection we'll > >>> >> >> >> > >> > >> never > >>> >> >> >> > >> > >> >>> get > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> anything out the door. 3.5 has tons > of > >>> >> great > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> new features, > >>> >> >> >> > >> > >> >>> lots of > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> hard work, let's get it out in a > >>> release so > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> that folks can > >>> >> >> >> > >> > >> use > >>> >> >> >> > >> > >> >>> it, > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> test it, and give feedback. 
> >>> >> >> >> > >> > >> >>> >>>>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Jenkins jobs have been pretty stable > >>> except > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> for the known > >>> >> >> >> > >> > >> >>> flakey > >>> >> >> >> > >> > >> >>> >>>> test > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> ZOOKEEPER-1870 which Flavio > committed > >>> >> today to > >>> >> >> >> > >> > trunk. > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Note > >>> >> >> >> > >> > >> that > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> jenkins has also been verifying the > >>> code on > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> jdk7 > >>> >> >> >> > and > >>> >> >> >> > >> > jdk8. > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Here's my thinking again on how we > >>> should > >>> >> plan > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> our > >>> >> >> >> > >> > >> releases: > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> I don't think we'll be able to do a > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> 3.5.x-stable > >>> >> >> >> > for > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> some > >>> >> >> >> > >> > >> time. > >>> >> >> >> > >> > >> >>> >>>> What I > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> think we should do instead is > similar to > >>> >> what > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> we > >>> >> >> >> > did > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> for > >>> >> >> >> > >> > >> 3.4. 
> >>> >> >> >> > >> > >> >>> >>>> (this is > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> also similar to what Hadoop did > during > >>> >> their > >>> >> >> >> > Hadoop 2 > >>> >> >> >> > >> > >> release > >>> >> >> >> > >> > >> >>> >>>> cycle) > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Start with a series of alpha > releases, > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> something people > >>> >> >> >> > >> > >> can run > >>> >> >> >> > >> > >> >>> >>>> and > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> test with, once we address all the > >>> blockers > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> and > >>> >> >> >> > feel > >>> >> >> >> > >> > >> >>> comfortable > >>> >> >> >> > >> > >> >>> >>>> with > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> the apis & remaining jiras we then > >>> switch > >>> >> to > >>> >> >> >> beta. > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Once we > >>> >> >> >> > >> > >> get > >>> >> >> >> > >> > >> >>> >>>> some > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> good feedback we remove the > alpha/beta > >>> >> moniker > >>> >> >> >> > >> > and > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> look at > >>> >> >> >> > >> > >> >>> making > >>> >> >> >> > >> > >> >>> >>>> it > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> "stable'. At some later point it > will > >>> >> become > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> the > >>> >> >> >> > >> > >> >>> "current/stable" > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> release, taking over from 3.4.x. > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> e.g. 
> >>> >> >> >> > >> > >> >>> >>>>>>>>>>> 3.5.0-alpha (8 blockers) > 3.5.1-alpha (3 > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> blockers) 3.5.2-alpha (0 blockers) > >>> >> 3.5.3-beta > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> (apis locked) 3.5.4-beta 3.5.5-beta > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> 3.5.6 (no longer considered > alpha/beta > >>> but > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> also not > >>> >> >> >> > >> > >> "stable" vs > >>> >> >> >> > >> > >> >>> >>>> 3.4.x, > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> maybe use it for production but we > still > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> expect things to > >>> >> >> >> > >> > >> shake > >>> >> >> >> > >> > >> >>> >>>> out) > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> 3.5.7 > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> .... > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> 3.5.x - ready to replace 3.4 > releases > >>> for > >>> >> >> >> > production > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> use, > >>> >> >> >> > >> > >> >>> stable, > >>> >> >> >> > >> > >> >>> >>>>>>> etc... > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> There are 8 blockers currently, are > any > >>> of > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> these something > >>> >> >> >> > >> > >> that > >>> >> >> >> > >> > >> >>> >>>> should > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> hold up 3.5.0-alpha? > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> I'll hold open the discussion for a > >>> couple > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> days. If folks > >>> >> >> >> > >> > >> find > >>> >> >> >> > >> > >> >>> >>>> this a > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> reasonable plan I'll start the ball > >>> >> rolling to > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> cut > >>> >> >> >> > an > >>> >> >> >> > >> RC. 
> >>> >> >> >> > >> > >> >>> >>>>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>>>> Patrick > >>> >> >> >> > >> > >> >>> >>>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>>> > >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> >>>>>> > >>> >> >> >> > >> > >> >>> > > >>> >> >> >> > >> > >> >>> > >>> >> >> >> > >> > >> > >>> >> >> >> > >> > >>> >> >> >> > >> > >>> >> >> >> > > >>> >> >> >> > >>> >> >> > >>> >> > >>> >