Hmm - I’m still getting failures - maybe I’m wrong. It’s late and I’m off to bed. I’ll look at this more tomorrow.
-Jordan > On Jun 1, 2016, at 10:59 PM, Cameron McKenzie <[email protected]> wrote: > > The counter is just being used to check if semaphores are still being > acquired. Essentially it just runs in a loop acquiring semaphores (and > incrementing the counter when they are acquired). > > Then it shuts down the server, waits until it the session is lost, then > restarts the server and then checks that semaphores are being acquired > correctly again (by checking that the counter is being incremented). > > This is just a simplified version of the test that is failing. > > When the test fails, all of the threads are attempting to get a lease on > the semaphore, but none of them get it, then the test times out while > waiting. > > > > On Thu, Jun 2, 2016 at 1:29 PM, Jordan Zimmerman <[email protected] >> wrote: > >> I also had to add: >> >> while(!lost.get() && (counter.get() > 0)) >> { >> Thread.sleep(1000); >> } >> Which seems more correct to me. >> >>> On Jun 1, 2016, at 9:07 PM, Cameron McKenzie <[email protected]> >> wrote: >>> >>> I have just pushed an interprocess_mutex_issue branch. The test case is >> in >>> TestInterprocessMutexNotReconnecting >>> >>> For me it's failing around 20% of the time. >>> cheers >>> >>> On Thu, Jun 2, 2016 at 11:17 AM, Cameron McKenzie < >> [email protected]> >>> wrote: >>> >>>> Yep, just let me confirm that it's actually getting the same problem. >> I'm >>>> sure it was before, but I've just run it a bunch of times and >> everything's >>>> been fine. >>>> >>>> On Thu, Jun 2, 2016 at 11:15 AM, Jordan Zimmerman < >>>> [email protected]> wrote: >>>> >>>>> Can you push your unit test somewhere? >>>>> >>>>>> On Jun 1, 2016, at 7:37 PM, Cameron McKenzie <[email protected]> >>>>> wrote: >>>>>> >>>>>> Indeed. There seems to be a problem with InterProcessSemaphoreV2 >> though. >>>>>> I've written a simplified unit test that just has a bunch of clients >>>>>> attempting to grab a lease on the semaphore. When I shutdown and >>>>> restart ZK >>>>>> about 25% of the time, none of the clients can reacquire the >> semaphore. >>>>>> >>>>>> Still trying to work out what's going on, but I'm probably not going >> to >>>>>> have a lot of time today to look at it. >>>>>> cheers >>>>>> >>>>>> On Thu, Jun 2, 2016 at 10:30 AM, Jordan Zimmerman < >>>>>> [email protected]> wrote: >>>>>> >>>>>>> Odd - SemaphoreClient does seem wrong. >>>>>>> >>>>>>>> On Jun 1, 2016, at 1:43 AM, Cameron McKenzie < >> [email protected]> >>>>>>> wrote: >>>>>>>> >>>>>>>> It looks like under some circumstances (which I haven't worked out >>>>> yet) >>>>>>>> that the InterprocessMutex acquire() is not working correctly when >>>>>>>> reconnecting to ZK. Still digging into why this is. >>>>>>>> >>>>>>>> There also seems to be a bug in the SemaphoreClient, unless I'm >>>>> missing >>>>>>>> something. At lines 126 and 140 it does compareAndSet() calls but >>>>> throws >>>>>>> an >>>>>>>> exception if they return true. As far as I can work out, this means >>>>> that >>>>>>>> whenever the lock is acquired, an exception gets thrown indicating >>>>> that >>>>>>>> there are Multiple acquirers. >>>>>>>> >>>>>>>> This test is failing fairly consistently. It seems to be the >> remaining >>>>>>> test >>>>>>>> that keeps failing in the Jenkins build also >>>>>>>> cheers >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Jun 1, 2016 at 3:10 PM, Cameron McKenzie < >>>>> [email protected] >>>>>>>> >>>>>>>> wrote: >>>>>>>> >>>>>>>>> Looks like I was incorrect. The NoWatcherException is being thrown >> on >>>>>>>>> success as well, and the problem is not in the cluster restart. >> Will >>>>>>> keep >>>>>>>>> digging. >>>>>>>>> >>>>>>>>> On Wed, Jun 1, 2016 at 2:52 PM, Cameron McKenzie < >>>>>>> [email protected]> >>>>>>>>> wrote: >>>>>>>>> >>>>>>>>>> TestInterProcessSemaphoreCluster.testCluster() is failling >>>>> (assertion >>>>>>> at >>>>>>>>>> line 294). Again, it seems like some sort of race condition with >> the >>>>>>>>>> watcher removal. >>>>>>>>>> >>>>>>>>>> When I run it in Eclipse, it fails maybe 25% of the time. When it >>>>> fails >>>>>>>>>> it seems that it's got something to do with watcher removal. When >>>>> the >>>>>>> test >>>>>>>>>> passes, this error is not logged. >>>>>>>>>> >>>>>>>>>> org.apache.zookeeper.KeeperException$NoWatcherException: >>>>>>> KeeperErrorCode >>>>>>>>>> = No such watcher for /foo/bar/lock/leases >>>>>>>>>> at >>>>>>>>>> >>>>>>> >>>>> >> org.apache.zookeeper.ZooKeeper$ZKWatchManager.containsWatcher(ZooKeeper.java:377) >>>>>>>>>> at >>>>>>>>>> >>>>>>> >>>>> >> org.apache.zookeeper.ZooKeeper$ZKWatchManager.removeWatcher(ZooKeeper.java:252) >>>>>>>>>> at >>>>>>>>>> >>>>>>> >>>>> >> org.apache.zookeeper.WatchDeregistration.unregister(WatchDeregistration.java:58) >>>>>>>>>> at >> org.apache.zookeeper.ClientCnxn.finishPacket(ClientCnxn.java:712) >>>>>>>>>> at org.apache.zookeeper.ClientCnxn.access$1500(ClientCnxn.java:97) >>>>>>>>>> at >>>>>>>>>> >>>>>>> >>>>> >> org.apache.zookeeper.ClientCnxn$SendThread.readResponse(ClientCnxn.java:948) >>>>>>>>>> at >>>>>>>>>> >>>>>>> >>>>> >> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:99) >>>>>>>>>> at >>>>>>>>>> >>>>>>> >>>>> >> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361) >>>>>>>>>> at >>>>> org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1236) >>>>>>>>>> >>>>>>>>>> Is it possible it's something to do with the way that the cluster >> is >>>>>>>>>> restarted at line 282? The old cluster is not shutdown, a new one >> is >>>>>>> just >>>>>>>>>> created. >>>>>>>>>> >>>>>>>>>> >>>>>>>>>> On Wed, Jun 1, 2016 at 10:44 AM, Jordan Zimmerman < >>>>>>>>>> [email protected]> wrote: >>>>>>>>>> >>>>>>>>>>> I’ll try to address this as part of CURATOR-333 >>>>>>>>>>> >>>>>>>>>>>> On May 31, 2016, at 7:08 PM, Cameron McKenzie < >>>>>>> [email protected]> >>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>> Maybe we need to look at some way of providing a hook for tests >> to >>>>>>> wait >>>>>>>>>>>> reliably for asynch tasks to finish? >>>>>>>>>>>> >>>>>>>>>>>> The latest round of tests ran OK. One test failed on an >> unrelated >>>>>>> thing >>>>>>>>>>>> (ConnectionLoss), but this appears to be a transient thing as >> it's >>>>>>>>>>> worked >>>>>>>>>>>> ok the next time around. >>>>>>>>>>>> >>>>>>>>>>>> I will start getting a release together. Thanks for you help >> with >>>>> the >>>>>>>>>>>> updated tests. >>>>>>>>>>>> cheers >>>>>>>>>>>> >>>>>>>>>>>> On Wed, Jun 1, 2016 at 9:12 AM, Jordan Zimmerman < >>>>>>>>>>> [email protected] >>>>>>>>>>>>> wrote: >>>>>>>>>>>> >>>>>>>>>>>>> The problem is in-flight watchers and async background calls. >>>>>>> There’s >>>>>>>>>>> no >>>>>>>>>>>>> way to cancel these and they can take time to occur - even >> after >>>>> a >>>>>>>>>>> recipe >>>>>>>>>>>>> instance is closed. >>>>>>>>>>>>> >>>>>>>>>>>>> -Jordan >>>>>>>>>>>>> >>>>>>>>>>>>>> On May 31, 2016, at 5:11 PM, Cameron McKenzie < >>>>>>>>>>> [email protected]> >>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>> Ok, running it again now. >>>>>>>>>>>>>> >>>>>>>>>>>>>> Is the problem that the watcher clean up for the recipes is >> done >>>>>>>>>>>>>> asynchronously after they are closed? >>>>>>>>>>>>>> >>>>>>>>>>>>>> On Wed, Jun 1, 2016 at 1:35 AM, Jordan Zimmerman < >>>>>>>>>>>>> [email protected] >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>> >>>>>>>>>>>>>>> OK - please try now. I added a loop in the “no watchers” >>>>> checker. >>>>>>> If >>>>>>>>>>>>> there >>>>>>>>>>>>>>> are remaining watchers, it will sleep a bit and try again. >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> -Jordan >>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On May 31, 2016, at 1:33 AM, Cameron McKenzie < >>>>>>>>>>> [email protected]> >>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> Looks like these failures are intermittent. Running them >>>>> directly >>>>>>>>>>> in >>>>>>>>>>>>>>>> Eclipse they seem to be passing. I will run the whole thing >>>>> again >>>>>>>>>>> in >>>>>>>>>>>>> the >>>>>>>>>>>>>>>> morning and see how it goes. >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>> On Tue, May 31, 2016 at 2:29 PM, Cameron McKenzie < >>>>>>>>>>>>>>> [email protected]> >>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> There are still 2 tests failing for me: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> FAILURE! - in >>>>>>>>>>>>>>>>> >>>>> org.apache.curator.framework.recipes.cache.TestPathChildrenCache >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>> >> testKilledSession(org.apache.curator.framework.recipes.cache.TestPathChildrenCache) >>>>>>>>>>>>>>>>> Time elapsed: 17.488 sec <<< FAILURE! >>>>>>>>>>>>>>>>> java.lang.AssertionError: One or more child watchers are >>>>> still >>>>>>>>>>>>>>> registered: >>>>>>>>>>>>>>>>> [/test] >>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>> >> org.apache.curator.framework.imps.TestCleanState.closeAndTestClean(TestCleanState.java:53) >>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>> >> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testKilledSession(TestPathChildrenCache.java:707) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> FAILURE! - in >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>> >> org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>> >> testCluster(org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster) >>>>>>>>>>>>>>>>> Time elapsed: 96.641 sec <<< FAILURE! >>>>>>>>>>>>>>>>> java.lang.AssertionError: expected [true] but found [false] >>>>>>>>>>>>>>>>> at org.testng.Assert.fail(Assert.java:94) >>>>>>>>>>>>>>>>> at org.testng.Assert.failNotEquals(Assert.java:494) >>>>>>>>>>>>>>>>> at org.testng.Assert.assertTrue(Assert.java:42) >>>>>>>>>>>>>>>>> at org.testng.Assert.assertTrue(Assert.java:52) >>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>> >> org.apache.curator.framework.recipes.locks.TestInterProcessSemaphoreCluster.testCluster(TestInterProcessSemaphoreCluster.java:294) >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Failed tests: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>> >> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testKilledSession(org.apache.curator.framework.recipes.cache.TestPathChildrenCache) >>>>>>>>>>>>>>>>> Run 1: TestPathChildrenCache.testKilledSession:707 One or >>>>> more >>>>>>>>>>> child >>>>>>>>>>>>>>>>> watchers are still registered: [/test] >>>>>>>>>>>>>>>>> Run 2: PASS >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> TestInterProcessSemaphoreCluster.testCluster:294 expected >>>>> [true] >>>>>>>>>>> but >>>>>>>>>>>>>>>>> found [false] >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> Tests run: 495, Failures: 2, Errors: 0, Skipped: 0 >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> On Tue, May 31, 2016 at 12:52 PM, Cameron McKenzie < >>>>>>>>>>>>>>> [email protected] >>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> Thanks, CURATOR-332 wasn't pushed. I will run the tests >>>>> against >>>>>>>>>>> that, >>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>> if it's all good will merge into CURATOR-3.0 >>>>>>>>>>>>>>>>>> cheers >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> On Tue, May 31, 2016 at 12:32 PM, Jordan Zimmerman < >>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> Actually - I don’t remember if branch CURATOR-332 is >> merged >>>>>>>>>>> yet. I >>>>>>>>>>>>>>>>>>> made/pushed my changes in CURATOR-332 >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> -jordan >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On May 26, 2016, at 10:04 PM, Cameron McKenzie < >>>>>>>>>>>>>>> [email protected]> >>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> I'm still seeing 6 failed tests that seem related to the >>>>> same >>>>>>>>>>> stuff >>>>>>>>>>>>>>>>>>> after >>>>>>>>>>>>>>>>>>>> merging your fix: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Failed tests: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>> >> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasics(org.apache.curator.framework.recipes.cache.TestPathChildrenCache) >>>>>>>>>>>>>>>>>>>> Run 1: TestPathChildrenCache.testBasics:863 One or more >>>>> child >>>>>>>>>>>>>>> watchers >>>>>>>>>>>>>>>>>>>> are still registered: [/test] >>>>>>>>>>>>>>>>>>>> Run 2: TestPathChildrenCache.testBasics:863 One or more >>>>> child >>>>>>>>>>>>>>> watchers >>>>>>>>>>>>>>>>>>>> are still registered: [/test] >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>> >> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor(org.apache.curator.framework.recipes.cache.TestPathChildrenCache) >>>>>>>>>>>>>>>>>>>> Run 1: >>>>>>>>>>>>>>> >> TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934 >>>>>>>>>>>>>>>>>>>> One or more child watchers are still registered: [/test] >>>>>>>>>>>>>>>>>>>> Run 2: >>>>>>>>>>>>>>> >> TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934 >>>>>>>>>>>>>>>>>>>> One or more child watchers are still registered: [/test] >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>> >> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testEnsurePath(org.apache.curator.framework.recipes.cache.TestPathChildrenCache) >>>>>>>>>>>>>>>>>>>> Run 1: TestPathChildrenCache.testEnsurePath:363 One or >>>>> more >>>>>>>>>>> child >>>>>>>>>>>>>>>>>>>> watchers are still registered: [/one/two/three] >>>>>>>>>>>>>>>>>>>> Run 2: TestPathChildrenCache.testEnsurePath:363 One or >>>>> more >>>>>>>>>>> child >>>>>>>>>>>>>>>>>>>> watchers are still registered: [/one/two/three] >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> TestInterProcessSemaphoreCluster.testCluster:294 >> expected >>>>>>>>>>> [true] >>>>>>>>>>>>> but >>>>>>>>>>>>>>>>>>>> found [false] >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>> >> org.apache.curator.framework.recipes.shared.TestSharedCount.testMultiClientVersioned(org.apache.curator.framework.recipes.shared.TestSharedCount) >>>>>>>>>>>>>>>>>>>> Run 1: PASS >>>>>>>>>>>>>>>>>>>> Run 2: TestSharedCount.testMultiClientVersioned:256 One >> or >>>>>>> more >>>>>>>>>>>>> data >>>>>>>>>>>>>>>>>>>> watchers are still registered: [/count] >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>> >>>>> >> org.apache.curator.framework.recipes.shared.TestSharedCount.testSimple(org.apache.curator.framework.recipes.shared.TestSharedCount) >>>>>>>>>>>>>>>>>>>> Run 1: TestSharedCount.testSimple:174 One or more data >>>>>>>>>>> watchers are >>>>>>>>>>>>>>>>>>> still >>>>>>>>>>>>>>>>>>>> registered: [/count] >>>>>>>>>>>>>>>>>>>> Run 2: TestSharedCount.testSimple:174 One or more data >>>>>>>>>>> watchers are >>>>>>>>>>>>>>>>>>> still >>>>>>>>>>>>>>>>>>>> registered: [/count] >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> Tests run: 491, Failures: 6, Errors: 0, Skipped: 0 >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>> On Fri, May 27, 2016 at 3:30 AM, Jordan Zimmerman < >>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> I see the problem. The fix is not simple though so I’ll >>>>>>> spend >>>>>>>>>>> some >>>>>>>>>>>>>>>>>>> time on >>>>>>>>>>>>>>>>>>>>> it. The TL;DR is that exists watchers are still >> supposed >>>>> to >>>>>>>>>>> get >>>>>>>>>>>>> set >>>>>>>>>>>>>>>>>>> when >>>>>>>>>>>>>>>>>>>>> there is a KeeperException.NoNode and the code isn’t >>>>>>> handling >>>>>>>>>>> it. >>>>>>>>>>>>>>> But, >>>>>>>>>>>>>>>>>>>>> while I was looking at the code I realized there are >> some >>>>>>>>>>>>>>> significant >>>>>>>>>>>>>>>>>>>>> additional problems. Curator, here, is trying to mirror >>>>> what >>>>>>>>>>>>>>>>>>> ZooKeeper does >>>>>>>>>>>>>>>>>>>>> internally which is insanely complicated. In hindsight, >>>>> the >>>>>>>>>>> whole >>>>>>>>>>>>> ZK >>>>>>>>>>>>>>>>>>>>> watcher mechanism should’ve been decoupled from the >>>>> mutator >>>>>>>>>>> APIs. >>>>>>>>>>>>>>>>>>> But, of >>>>>>>>>>>>>>>>>>>>> course, that’s easy for me to say now. >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> -Jordan >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On May 26, 2016, at 1:10 AM, Cameron McKenzie < >>>>>>>>>>>>>>>>>>> [email protected]> >>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Thanks Scott, >>>>>>>>>>>>>>>>>>>>>> Those tests are now passing for me. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> Jordan, testNodeCache:testBasics() is failing >>>>> consistently >>>>>>>>>>> on the >>>>>>>>>>>>>>> 3.0 >>>>>>>>>>>>>>>>>>>>>> branch. It appears that this is actually potentially a >>>>> bug >>>>>>>>>>> in the >>>>>>>>>>>>>>>>>>>>>> NodeCache. It ends up leaking a Watcher reference. >> I've >>>>>>> had a >>>>>>>>>>>>> quick >>>>>>>>>>>>>>>>>>> look >>>>>>>>>>>>>>>>>>>>>> through, but I haven't dived in in any detail. It's >> the >>>>>>>>>>>>>>>>>>>>>> WatcherRemovalManager stuff I think. If you've got >> time, >>>>>>> can >>>>>>>>>>> you >>>>>>>>>>>>>>>>>>> have a >>>>>>>>>>>>>>>>>>>>>> look? If not, let me know and I'll do some more >> digging. >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> cheers >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>> On Thu, May 26, 2016 at 11:47 AM, Cameron McKenzie < >>>>>>>>>>>>>>>>>>>>> [email protected]> >>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Thanks Scott. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Push the fix to master and merge it into 3.0. >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> Then I guess, I'll push new versions of 2.11 and 3.2 >>>>> onto >>>>>>>>>>> Nexus. >>>>>>>>>>>>>>>>>>>>>>> cheers >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> On Thu, May 26, 2016 at 11:44 AM, Scott Blum < >>>>>>>>>>>>>>> [email protected] >>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> Alright, I have a fix, but it wants to be applied to >>>>> both >>>>>>>>>>>>> master >>>>>>>>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>> 3.0. >>>>>>>>>>>>>>>>>>>>>>>> Where should I push the fix? >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 6:10 PM, Cameron McKenzie < >>>>>>>>>>>>>>>>>>>>> [email protected] >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> Thanks Scott, >>>>>>>>>>>>>>>>>>>>>>>>> If you just checkout the CURATOR-3.0 branch, they >> are >>>>>>>>>>> failing >>>>>>>>>>>>>>>>>>> there. >>>>>>>>>>>>>>>>>>>>>>>>> cheers >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> On Thu, May 26, 2016 at 2:06 AM, Scott Blum < >>>>>>>>>>>>>>>>>>> [email protected]> >>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> Sure, what SHA are they failing at Cam? >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 9:36 AM, Jordan Zimmerman >> < >>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> Scott can you take a look? >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> -Jordan >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On May 25, 2016, at 4:35 AM, Cameron McKenzie < >>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> >>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> Tree cache tests are still failing. I've tried a >>>>> few >>>>>>>>>>> times >>>>>>>>>>>>>>> but >>>>>>>>>>>>>>>>>>> no >>>>>>>>>>>>>>>>>>>>>>>>> love: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>> >> TestTreeCacheEventOrdering>TestEventOrdering.testEventOrdering:151 >>>>>>>>>>>>>>>>>>>>>>>>>>> actual 6 >>>>>>>>>>>>>>>>>>>>>>>>>>>> expected -31: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> I will have a look into what's going on in the >>>>>>> morning. >>>>>>>>>>>>> Given >>>>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>>> these >>>>>>>>>>>>>>>>>>>>>>>>>>>> may take some messing about to fix up, do we >> just >>>>>>> want >>>>>>>>>>> to >>>>>>>>>>>>>>> vote >>>>>>>>>>>>>>>>>>> on >>>>>>>>>>>>>>>>>>>>>>>>>> 2.11.0 >>>>>>>>>>>>>>>>>>>>>>>>>>>> separately, as that is all ready to go? >>>>>>>>>>>>>>>>>>>>>>>>>>>> cheers >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 5:34 PM, Jordan >> Zimmerman >>>>> < >>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Great news. Thanks. >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> ==================== >>>>>>>>>>>>>>>>>>>>>>>>>>>>> Jordan Zimmerman >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On May 25, 2016, at 12:37 AM, Cameron >> McKenzie < >>>>>>>>>>>>>>>>>>>>>>>>>> [email protected] >>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I have fixed up the test case, all good now. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 1:45 PM, Cameron >>>>> McKenzie < >>>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Looks like it was introduced with the schema >>>>>>>>>>> validation >>>>>>>>>>>>>>>>>>> stuff. >>>>>>>>>>>>>>>>>>>>>>>> It >>>>>>>>>>>>>>>>>>>>>>>>>> now >>>>>>>>>>>>>>>>>>>>>>>>>>>>> does >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> an ACL check prior to the backgrounding call. >>>>>>>>>>> Because >>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>> unit >>>>>>>>>>>>>>>>>>>>>>>>> test >>>>>>>>>>>>>>>>>>>>>>>>>>>>> uses a >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> bogus ACL provider it just throws an >> exception >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> final String adjustedPath = >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> adjustPath(client.fixForNamespace(givenPath, >>>>>>>>>>>>>>>>>>>>>>>>>>>>> createMode.isSequential())); >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> List<ACL> aclList = >>>>>>>>>>> acling.getAclList(adjustedPath); >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>> client.getSchemaSet().getSchema(givenPath).validateCreate(createMode, >>>>>>>>>>>>>>>>>>>>>>>>>>>>> data, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> aclList); >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> String returnPath = null; >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> if ( backgrounding.inBackground() ) >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> { >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> pathInBackground(adjustedPath, data, >>>>>>>>>>> givenPath); >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So, I guess the answer is to get the test to >>>>>>> force a >>>>>>>>>>>>>>> failure >>>>>>>>>>>>>>>>>>>>>>>> in a >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> different way. With the >> UnhandledErrorListener, >>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>> expectation is >>>>>>>>>>>>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> this only gets called on backgrounding >>>>> operations? >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cheers >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 1:39 PM, Cameron >>>>> McKenzie >>>>>>> < >>>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Same on the master branch, but it passes >>>>> there, >>>>>>> so >>>>>>>>>>>>> maybe >>>>>>>>>>>>>>>>>>>>>>>>> something >>>>>>>>>>>>>>>>>>>>>>>>>>> has >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> legitimately broken the test. Will let you >>>>> know >>>>>>> if >>>>>>>>>>> I >>>>>>>>>>>>> get >>>>>>>>>>>>>>>>>>>>>>>> stuck. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cheers >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On Wed, May 25, 2016 at 1:36 PM, Jordan >>>>>>> Zimmerman < >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Let me know if you need help. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> It might be a bad merge. Have you compared >>>>> it to >>>>>>>>>>> the >>>>>>>>>>>>>>>>>>> master >>>>>>>>>>>>>>>>>>>>>>>>>> branch? >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> -JZ >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> On May 24, 2016, at 10:31 PM, Cameron >>>>>>> McKenzie < >>>>>>>>>>>>>>>>>>>>>>>>>>>>> [email protected]> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> wrote: >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> Guys, >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> There's a test >>>>>>>>>>>>>>> TestFrameworkBackground:testErrorListener >>>>>>>>>>>>>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>>>>>>>> is >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> failing >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> consistently on the CURATOR-3.0 branch. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> I can't see how it has ever worked. It >>>>> seems to >>>>>>>>>>> try >>>>>>>>>>>>> and >>>>>>>>>>>>>>>>>>>>>>>> provoke >>>>>>>>>>>>>>>>>>>>>>>>>> an >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> error >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> via a bad ACL provider. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> But this ACL provider is called by the >>>>>>>>>>>>>>> CreateBuilderImpl >>>>>>>>>>>>>>>>>>>>>>>> prior >>>>>>>>>>>>>>>>>>>>>>>>> to >>>>>>>>>>>>>>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> backgrounding call, which means that the >>>>>>>>>>> exception >>>>>>>>>>>>> that >>>>>>>>>>>>>>>>>>> it >>>>>>>>>>>>>>>>>>>>>>>>> throws >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> happens >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> in the main Thread of the unit test. So, >> it >>>>>>> just >>>>>>>>>>>>> throws >>>>>>>>>>>>>>>>>>> an >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> UnsupportedOperationException which is >>>>>>>>>>> propogated up >>>>>>>>>>>>>>> the >>>>>>>>>>>>>>>>>>>>>>>> stack >>>>>>>>>>>>>>>>>>>>>>>>> at >>>>>>>>>>>>>>>>>>>>>>>>>>>>> which >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> point the unit test fails. >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> So, I will look at fixing this, but I just >>>>>>> don't >>>>>>>>>>>>>>>>>>> understand >>>>>>>>>>>>>>>>>>>>>>>> how >>>>>>>>>>>>>>>>>>>>>>>>>> it >>>>>>>>>>>>>>>>>>>>>>>>>>>>> ever >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> worked? >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> cheers >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>>> >>>>>>>>>> >>>>>>>>> >>>>>>> >>>>>>> >>>>> >>>>> >>>> >> >>
