I see the problem. The fix is not simple though so I’ll spend some time on it. The TL;DR is that exists watchers are still supposed to get set when there is a KeeperException.NoNode and the code isn’t handling it. But, while I was looking at the code I realized there are some significant additional problems. Curator, here, is trying to mirror what ZooKeeper does internally which is insanely complicated. In hindsight, the whole ZK watcher mechanism should’ve been decoupled from the mutator APIs. But, of course, that’s easy for me to say now.
-Jordan > On May 26, 2016, at 1:10 AM, Cameron McKenzie <[email protected]> wrote: > > Thanks Scott, > Those tests are now passing for me. > > Jordan, testNodeCache:testBasics() is failing consistently on the 3.0 > branch. It appears that this is actually potentially a bug in the > NodeCache. It ends up leaking a Watcher reference. I've had a quick look > through, but I haven't dived in in any detail. It's the > WatcherRemovalManager stuff I think. If you've got time, can you have a > look? If not, let me know and I'll do some more digging. > > cheers > > On Thu, May 26, 2016 at 11:47 AM, Cameron McKenzie <[email protected]> > wrote: > >> Thanks Scott. >> >> Push the fix to master and merge it into 3.0. >> >> Then I guess, I'll push new versions of 2.11 and 3.2 onto Nexus. >> cheers >> >> On Thu, May 26, 2016 at 11:44 AM, Scott Blum <[email protected]> >> wrote: >> >>> Alright, I have a fix, but it wants to be applied to both master and 3.0. >>> Where should I push the fix? >>> >>> On Wed, May 25, 2016 at 6:10 PM, Cameron McKenzie <[email protected] >>>> >>> wrote: >>> >>>> Thanks Scott, >>>> If you just checkout the CURATOR-3.0 branch, they are failing there. >>>> cheers >>>> >>>> On Thu, May 26, 2016 at 2:06 AM, Scott Blum <[email protected]> >>> wrote: >>>> >>>>> Sure, what SHA are they failing at Cam? >>>>> >>>>> On Wed, May 25, 2016 at 9:36 AM, Jordan Zimmerman < >>>>> [email protected]> wrote: >>>>> >>>>>> Scott can you take a look? >>>>>> >>>>>> -Jordan >>>>>> >>>>>>> On May 25, 2016, at 4:35 AM, Cameron McKenzie < >>>> [email protected]> >>>>>> wrote: >>>>>>> >>>>>>> Tree cache tests are still failing. I've tried a few times but no >>>> love: >>>>>>> >>>>>>> TestTreeCacheEventOrdering>TestEventOrdering.testEventOrdering:151 >>>>>> actual 6 >>>>>>> expected -31: >>>>>>> >>>>>>> I will have a look into what's going on in the morning. Given that >>>>> these >>>>>>> may take some messing about to fix up, do we just want to vote on >>>>> 2.11.0 >>>>>>> separately, as that is all ready to go? >>>>>>> cheers >>>>>>> >>>>>>> On Wed, May 25, 2016 at 5:34 PM, Jordan Zimmerman < >>>>>>> [email protected]> wrote: >>>>>>> >>>>>>>> Great news. Thanks. >>>>>>>> >>>>>>>> ==================== >>>>>>>> Jordan Zimmerman >>>>>>>> >>>>>>>>> On May 25, 2016, at 12:37 AM, Cameron McKenzie < >>>>> [email protected] >>>>>>> >>>>>>>> wrote: >>>>>>>>> >>>>>>>>> I have fixed up the test case, all good now. >>>>>>>>> >>>>>>>>> On Wed, May 25, 2016 at 1:45 PM, Cameron McKenzie < >>>>>>>> [email protected]> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> Looks like it was introduced with the schema validation stuff. >>> It >>>>> now >>>>>>>> does >>>>>>>>>> an ACL check prior to the backgrounding call. Because the unit >>>> test >>>>>>>> uses a >>>>>>>>>> bogus ACL provider it just throws an exception >>>>>>>>>> >>>>>>>>>> final String adjustedPath = >>>>>>>>>> adjustPath(client.fixForNamespace(givenPath, >>>>>>>> createMode.isSequential())); >>>>>>>>>> List<ACL> aclList = acling.getAclList(adjustedPath); >>>>>>>>>> >>>>>>>>>> >>>>> client.getSchemaSet().getSchema(givenPath).validateCreate(createMode, >>>>>>>> data, >>>>>>>>>> aclList); >>>>>>>>>> >>>>>>>>>> String returnPath = null; >>>>>>>>>> if ( backgrounding.inBackground() ) >>>>>>>>>> { >>>>>>>>>> pathInBackground(adjustedPath, data, givenPath); >>>>>>>>>> >>>>>>>>>> So, I guess the answer is to get the test to force a failure >>> in a >>>>>>>>>> different way. With the UnhandledErrorListener, the >>> expectation is >>>>>> that >>>>>>>>>> this only gets called on backgrounding operations? >>>>>>>>>> cheers >>>>>>>>>> >>>>>>>>>> On Wed, May 25, 2016 at 1:39 PM, Cameron McKenzie < >>>>>>>> [email protected]> >>>>>>>>>> wrote: >>>>>>>>>> >>>>>>>>>>> Same on the master branch, but it passes there, so maybe >>>> something >>>>>> has >>>>>>>>>>> legitimately broken the test. Will let you know if I get >>> stuck. >>>>>>>>>>> cheers >>>>>>>>>>> >>>>>>>>>>> On Wed, May 25, 2016 at 1:36 PM, Jordan Zimmerman < >>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>> >>>>>>>>>>>> Let me know if you need help. >>>>>>>>>>>> >>>>>>>>>>>> It might be a bad merge. Have you compared it to the master >>>>> branch? >>>>>>>>>>>> >>>>>>>>>>>> -JZ >>>>>>>>>>>> >>>>>>>>>>>>>> On May 24, 2016, at 10:31 PM, Cameron McKenzie < >>>>>>>> [email protected]> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>> >>>>>>>>>>>>> Guys, >>>>>>>>>>>>> There's a test TestFrameworkBackground:testErrorListener >>> that >>>> is >>>>>>>>>>>> failing >>>>>>>>>>>>> consistently on the CURATOR-3.0 branch. >>>>>>>>>>>>> >>>>>>>>>>>>> I can't see how it has ever worked. It seems to try and >>> provoke >>>>> an >>>>>>>>>>>> error >>>>>>>>>>>>> via a bad ACL provider. >>>>>>>>>>>>> >>>>>>>>>>>>> But this ACL provider is called by the CreateBuilderImpl >>> prior >>>> to >>>>>> the >>>>>>>>>>>>> backgrounding call, which means that the exception that it >>>> throws >>>>>>>>>>>> happens >>>>>>>>>>>>> in the main Thread of the unit test. So, it just throws an >>>>>>>>>>>>> UnsupportedOperationException which is propogated up the >>> stack >>>> at >>>>>>>> which >>>>>>>>>>>>> point the unit test fails. >>>>>>>>>>>>> >>>>>>>>>>>>> So, I will look at fixing this, but I just don't understand >>> how >>>>> it >>>>>>>> ever >>>>>>>>>>>>> worked? >>>>>>>>>>>>> cheers >>>>>>>>>> >>>>>>>> >>>>>> >>>>>> >>>>> >>>> >>> >> >>
