Good news: the patch build ran successfully and gave every check a +1. What's next to get this into trunk?
On Thu, May 19, 2011 at 2:05 PM, Patrick Hunt <[email protected]> wrote:
> Hi Ketan, sorry about this. A number of build folks have looked but
> can't seem to figure out what's wrong on some of these build hosts.
> Running "java" just fails for no reason.
>
> Nigel and I spent part of the day yesterday looking into this with no
> luck. For the time being I've pinned the job down to hadoop9 (where
> java seems to be picked up fine). You should see a report come through
> shortly.
>
> Feel free to reach out to me personally wrt finalizing this issue.
>
> Patrick
>
> On Tue, May 17, 2011 at 6:07 PM, Ketan Gangatirkar <[email protected]> wrote:
>> Hi. Has there been any progress on this? Thanks.
>>
>> On Fri, May 6, 2011 at 11:32 AM, Patrick Hunt <[email protected]> wrote:
>>> Mahadev is working with Giri to address. The jenkins folks are saying
>>> this is a machine administered by Yahoo and the issue needs to be
>>> addressed with them (their admins, but Mahadev/Giri are looking into it
>>> from our (zk) side).
>>>
>>> Patrick
>>>
>>> On Fri, May 6, 2011 at 4:33 AM, Ketan Gangatirkar <[email protected]> wrote:
>>>> Hi, Patrick. Were you able to get any assistance from the hudson
>>>> admins? Thanks.
>>>>
>>>> On Wed, May 4, 2011 at 12:53 PM, Patrick Hunt <[email protected]> wrote:
>>>>> This is odd, it's failing in the c tests but for a weird reason:
>>>>>
>>>>> in:
>>>>> https://builds.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/247/artifact/trunk/build/tmp/zk.log
>>>>>
>>>>> it says:
>>>>> /grid/0/hudson/hudson-slave/workspace/PreCommit-ZOOKEEPER-Build/trunk/src/c/tests/zkServer.sh:
>>>>> line 115: java: command not found
>>>>>
>>>>> I'll ping the hudson admins and see if this is a known issue (also
>>>>> hudson is very slow today for some reason).
>>>>>
>>>>> Once that's addressed we should be good to go.
>>>>>
>>>>> Patrick
>>>>>
>>>>> On Wed, May 4, 2011 at 9:57 AM, Ketan Gangatirkar <[email protected]>
>>>>> wrote:
>>>>>> Got the patch formatted right and applying successfully, now I'll see
>>>>>> if I can figure out the unit test failure.
>>>>>>
>>>>>> On Wed, May 4, 2011 at 11:26 AM, Patrick Hunt <[email protected]> wrote:
>>>>>>> Hi Ketan, the patch is failing to apply
>>>>>>> https://builds.apache.org/hudson/job/PreCommit-ZOOKEEPER-Build/246//console
>>>>>>>
>>>>>>> Looks like you used git, I usually do something like:
>>>>>>> git diff rev1..rev2 --no-prefix > ZOOKEEPER-784.patch
>>>>>>> can you give it another try?
>>>>>>>
>>>>>>> Patrick
>>>>>>>
>>>>>>> On Tue, May 3, 2011 at 6:42 PM, Ketan Gangatirkar <[email protected]>
>>>>>>> wrote:
>>>>>>>> I have updated Sergey's patch to:
>>>>>>>>
>>>>>>>> * apply to current trunk
>>>>>>>> * incorporate one trivial output change he made to StatCommand in
>>>>>>>>   NettyServerCnxn.java
>>>>>>>> * change log4j references to slf4j
>>>>>>>>
>>>>>>>> I have successfully run ant releaseaudit on the result. The updated
>>>>>>>> patch is now attached to the issue:
>>>>>>>>
>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-784
>>>>>>>>
>>>>>>>> I do *not* make any claim to have understood the contents of this
>>>>>>>> patch; all I did was synch everything and fix the obvious log4j/slf4j
>>>>>>>> change. Now what?
>>>>>>>>
>>>>>>>>
>>>>>>>> On Tue, May 3, 2011 at 5:46 PM, Patrick Hunt <[email protected]> wrote:
>>>>>>>>> The core tests failed on last hudson, I just kicked off a patch build,
>>>>>>>>> seems recent changes (logging?) have caused the patch to stop
>>>>>>>>> applying:
>>>>>>>>> https://hudson.apache.org/hudson/view/S-Z/view/ZooKeeper/job/PreCommit-ZOOKEEPER-Build/238/console
>>>>>>>>>
>>>>>>>>> Ketan would you like to try updating the patch and resubmit?
>>>>>>>>>
>>>>>>>>> Patrick
>>>>>>>>>
>>>>>>>>> On Tue, May 3, 2011 at 3:31 PM, Ketan Gangatirkar <[email protected]>
>>>>>>>>> wrote:
>>>>>>>>>> Thanks, Mahadev. I had seen ZOOKEEPER-892 but not ZOOKEEPER-784.
>>>>>>>>>> The latter may be what we need.
>>>>>>>>>>
>>>>>>>>>> I read the comments attached to that issue. The most recent comment
>>>>>>>>>> was a Hudson CI message indicating that the tests against the patch
>>>>>>>>>> failed. I was not able to find out more as it appears that the
>>>>>>>>>> configuration of the Apache Hudson has changed. It appears that the
>>>>>>>>>> patch was approved but not merged into trunk, and it's now in limbo.
>>>>>>>>>> What is necessary to get that feature into the next release? I may be
>>>>>>>>>> able to assist, depending on what's involved. Thank you.
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Tue, May 3, 2011 at 4:17 PM, Mahadev Konar <[email protected]>
>>>>>>>>>> wrote:
>>>>>>>>>>> Hi Ketan,
>>>>>>>>>>> You are correct that observers need connection to quorum as well.
>>>>>>>>>>> There have been quite a few discussions on multi colo replication and
>>>>>>>>>>> read only mode of ZooKeeper.
>>>>>>>>>>>
>>>>>>>>>>> Here are the jiras for those:
>>>>>>>>>>>
>>>>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-784
>>>>>>>>>>> and
>>>>>>>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-892
>>>>>>>>>>>
>>>>>>>>>>> These have been mostly targeted at exactly a use case like yours.
>>>>>>>>>>> Please take a look at them and feel free to contribute/comment on the
>>>>>>>>>>> jiras.
>>>>>>>>>>>
>>>>>>>>>>> --
>>>>>>>>>>> thanks
>>>>>>>>>>> mahadev
>>>>>>>>>>> @mahadevkonar
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Tue, May 3, 2011 at 2:07 PM, Ketan Gangatirkar
>>>>>>>>>>> <[email protected]> wrote:
>>>>>>>>>>>> Hi. We're considering ZooKeeper for coordinating operations across
>>>>>>>>>>>> multiple data centers.
>>>>>>>>>>>> These data centers will occasionally be
>>>>>>>>>>>> disconnected. We were planning on using observers in remote data
>>>>>>>>>>>> centers. Our applications can survive being unable to *write* to
>>>>>>>>>>>> ZooKeeper, but they do need to be able to read from it, even if the
>>>>>>>>>>>> data were stale.
>>>>>>>>>>>>
>>>>>>>>>>>> On further examination, it looks like observers must always be
>>>>>>>>>>>> connected to the quorum to function at all. Is this correct? Does
>>>>>>>>>>>> anyone have suggestions for how to work around this problem? The
>>>>>>>>>>>> first thing that comes to mind is duplicating the required data in
>>>>>>>>>>>> some other local data store and falling back on that when the DC
>>>>>>>>>>>> becomes disconnected. I imagine the disadvantages of that are obvious
>>>>>>>>>>>> to everyone. I hope someone can share some great idea that allows me
>>>>>>>>>>>> to avoid that miserable fate. Thanks.
>>>>>>>>>>>>
>>>>>>>>>>>> --
>>>>>>>>>>>> Ketan Gangatirkar
>>>>>>>>>>>> [email protected]
>>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> --
>>>>>>>>>> Ketan Gangatirkar
>>>>>>>>>> [email protected]
>>>>>>>>>> Perishable Developer
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> --
>>>>>>>> Ketan Gangatirkar
>>>>>>>> [email protected]
>>>>>>>> Perishable Developer
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> --
>>>>>> Ketan Gangatirkar
>>>>>> [email protected]
>>>>>> Perishable Developer
>>>>>>
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Ketan Gangatirkar
>>>> [email protected]
>>>> Perishable Developer
>>>>
>>>
>>
>>
>>
>> --
>> Ketan Gangatirkar
>> [email protected]
>> Perishable Developer
>>
>
--
Ketan Gangatirkar
[email protected]
Perishable Developer
