I’ll have a look.
> On May 26, 2016, at 10:04 PM, Cameron McKenzie <[email protected]> wrote:
>
> I'm still seeing 6 failed tests that seem related to the same stuff after
> merging your fix:
>
> Failed tests:
> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasics(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> Run 1: TestPathChildrenCache.testBasics:863 One or more child watchers
> are still registered: [/test]
> Run 2: TestPathChildrenCache.testBasics:863 One or more child watchers
> are still registered: [/test]
>
> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> Run 1: TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934
> One or more child watchers are still registered: [/test]
> Run 2: TestPathChildrenCache.testBasicsOnTwoCachesWithSameExecutor:934
> One or more child watchers are still registered: [/test]
>
> org.apache.curator.framework.recipes.cache.TestPathChildrenCache.testEnsurePath(org.apache.curator.framework.recipes.cache.TestPathChildrenCache)
> Run 1: TestPathChildrenCache.testEnsurePath:363 One or more child
> watchers are still registered: [/one/two/three]
> Run 2: TestPathChildrenCache.testEnsurePath:363 One or more child
> watchers are still registered: [/one/two/three]
>
> TestInterProcessSemaphoreCluster.testCluster:294 expected [true] but
> found [false]
> org.apache.curator.framework.recipes.shared.TestSharedCount.testMultiClientVersioned(org.apache.curator.framework.recipes.shared.TestSharedCount)
> Run 1: PASS
> Run 2: TestSharedCount.testMultiClientVersioned:256 One or more data
> watchers are still registered: [/count]
>
> org.apache.curator.framework.recipes.shared.TestSharedCount.testSimple(org.apache.curator.framework.recipes.shared.TestSharedCount)
> Run 1: TestSharedCount.testSimple:174 One or more data watchers are still
> registered: [/count]
> Run 2: TestSharedCount.testSimple:174 One or more data watchers are still
> registered: [/count]
>
>
> Tests run: 491, Failures: 6, Errors: 0, Skipped: 0
>
>
> On Fri, May 27, 2016 at 3:30 AM, Jordan Zimmerman <
> [email protected]> wrote:
>
>> I see the problem. The fix is not simple though so I’ll spend some time on
>> it. The TL;DR is that exists watchers are still supposed to get set when
>> there is a KeeperException.NoNode and the code isn’t handling it. But,
>> while I was looking at the code I realized there are some significant
>> additional problems. Curator, here, is trying to mirror what ZooKeeper does
>> internally which is insanely complicated. In hindsight, the whole ZK
>> watcher mechanism should’ve been decoupled from the mutator APIs. But, of
>> course, that’s easy for me to say now.
>>
>> -Jordan
>>
>>> On May 26, 2016, at 1:10 AM, Cameron McKenzie <[email protected]>
>> wrote:
>>>
>>> Thanks Scott,
>>> Those tests are now passing for me.
>>>
>>> Jordan, testNodeCache:testBasics() is failing consistently on the 3.0
>>> branch. It appears that this is actually potentially a bug in the
>>> NodeCache. It ends up leaking a Watcher reference. I've had a quick look
>>> through, but I haven't dived in in any detail. It's the
>>> WatcherRemovalManager stuff I think. If you've got time, can you have a
>>> look? If not, let me know and I'll do some more digging.
>>>
>>> cheers
>>>
>>> On Thu, May 26, 2016 at 11:47 AM, Cameron McKenzie <
>> [email protected]>
>>> wrote:
>>>
>>>> Thanks Scott.
>>>>
>>>> Push the fix to master and merge it into 3.0.
>>>>
>>>> Then I guess, I'll push new versions of 2.11 and 3.2 onto Nexus.
>>>> cheers
>>>>
>>>> On Thu, May 26, 2016 at 11:44 AM, Scott Blum <[email protected]>
>>>> wrote:
>>>>
>>>>> Alright, I have a fix, but it wants to be applied to both master and
>> 3.0.
>>>>> Where should I push the fix?
>>>>>
>>>>> On Wed, May 25, 2016 at 6:10 PM, Cameron McKenzie <
>> [email protected]
>>>>>>
>>>>> wrote:
>>>>>
>>>>>> Thanks Scott,
>>>>>> If you just checkout the CURATOR-3.0 branch, they are failing there.
>>>>>> cheers
>>>>>>
>>>>>> On Thu, May 26, 2016 at 2:06 AM, Scott Blum <[email protected]>
>>>>> wrote:
>>>>>>
>>>>>>> Sure, what SHA are they failing at Cam?
>>>>>>>
>>>>>>> On Wed, May 25, 2016 at 9:36 AM, Jordan Zimmerman <
>>>>>>> [email protected]> wrote:
>>>>>>>
>>>>>>>> Scott can you take a look?
>>>>>>>>
>>>>>>>> -Jordan
>>>>>>>>
>>>>>>>>> On May 25, 2016, at 4:35 AM, Cameron McKenzie <
>>>>>> [email protected]>
>>>>>>>> wrote:
>>>>>>>>>
>>>>>>>>> Tree cache tests are still failing. I've tried a few times but no
>>>>>> love:
>>>>>>>>>
>>>>>>>>> TestTreeCacheEventOrdering>TestEventOrdering.testEventOrdering:151
>>>>>>>> actual 6
>>>>>>>>> expected -31:
>>>>>>>>>
>>>>>>>>> I will have a look into what's going on in the morning. Given that
>>>>>>> these
>>>>>>>>> may take some messing about to fix up, do we just want to vote on
>>>>>>> 2.11.0
>>>>>>>>> separately, as that is all ready to go?
>>>>>>>>> cheers
>>>>>>>>>
>>>>>>>>> On Wed, May 25, 2016 at 5:34 PM, Jordan Zimmerman <
>>>>>>>>> [email protected]> wrote:
>>>>>>>>>
>>>>>>>>>> Great news. Thanks.
>>>>>>>>>>
>>>>>>>>>> ====================
>>>>>>>>>> Jordan Zimmerman
>>>>>>>>>>
>>>>>>>>>>> On May 25, 2016, at 12:37 AM, Cameron McKenzie <
>>>>>>> [email protected]
>>>>>>>>>
>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>> I have fixed up the test case, all good now.
>>>>>>>>>>>
>>>>>>>>>>> On Wed, May 25, 2016 at 1:45 PM, Cameron McKenzie <
>>>>>>>>>> [email protected]>
>>>>>>>>>>> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Looks like it was introduced with the schema validation stuff.
>>>>> It
>>>>>>> now
>>>>>>>>>> does
>>>>>>>>>>>> an ACL check prior to the backgrounding call. Because the unit
>>>>>> test
>>>>>>>>>> uses a
>>>>>>>>>>>> bogus ACL provider it just throws an exception
>>>>>>>>>>>>
>>>>>>>>>>>> final String adjustedPath =
>>>>>>>>>>>> adjustPath(client.fixForNamespace(givenPath,
>>>>>>>>>> createMode.isSequential()));
>>>>>>>>>>>> List<ACL> aclList = acling.getAclList(adjustedPath);
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>> client.getSchemaSet().getSchema(givenPath).validateCreate(createMode,
>>>>>>>>>> data,
>>>>>>>>>>>> aclList);
>>>>>>>>>>>>
>>>>>>>>>>>> String returnPath = null;
>>>>>>>>>>>> if ( backgrounding.inBackground() )
>>>>>>>>>>>> {
>>>>>>>>>>>> pathInBackground(adjustedPath, data, givenPath);
>>>>>>>>>>>>
>>>>>>>>>>>> So, I guess the answer is to get the test to force a failure
>>>>> in a
>>>>>>>>>>>> different way. With the UnhandledErrorListener, the
>>>>> expectation is
>>>>>>>> that
>>>>>>>>>>>> this only gets called on backgrounding operations?
>>>>>>>>>>>> cheers
>>>>>>>>>>>>
>>>>>>>>>>>> On Wed, May 25, 2016 at 1:39 PM, Cameron McKenzie <
>>>>>>>>>> [email protected]>
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Same on the master branch, but it passes there, so maybe
>>>>>> something
>>>>>>>> has
>>>>>>>>>>>>> legitimately broken the test. Will let you know if I get
>>>>> stuck.
>>>>>>>>>>>>> cheers
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Wed, May 25, 2016 at 1:36 PM, Jordan Zimmerman <
>>>>>>>>>>>>> [email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Let me know if you need help.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> It might be a bad merge. Have you compared it to the master
>>>>>>> branch?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> -JZ
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On May 24, 2016, at 10:31 PM, Cameron McKenzie <
>>>>>>>>>> [email protected]>
>>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Guys,
>>>>>>>>>>>>>>> There's a test TestFrameworkBackground:testErrorListener
>>>>> that
>>>>>> is
>>>>>>>>>>>>>> failing
>>>>>>>>>>>>>>> consistently on the CURATOR-3.0 branch.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I can't see how it has ever worked. It seems to try and
>>>>> provoke
>>>>>>> an
>>>>>>>>>>>>>> error
>>>>>>>>>>>>>>> via a bad ACL provider.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> But this ACL provider is called by the CreateBuilderImpl
>>>>> prior
>>>>>> to
>>>>>>>> the
>>>>>>>>>>>>>>> backgrounding call, which means that the exception that it
>>>>>> throws
>>>>>>>>>>>>>> happens
>>>>>>>>>>>>>>> in the main Thread of the unit test. So, it just throws an
>>>>>>>>>>>>>>> UnsupportedOperationException which is propogated up the
>>>>> stack
>>>>>> at
>>>>>>>>>> which
>>>>>>>>>>>>>>> point the unit test fails.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> So, I will look at fixing this, but I just don't understand
>>>>> how
>>>>>>> it
>>>>>>>>>> ever
>>>>>>>>>>>>>>> worked?
>>>>>>>>>>>>>>> cheers
>>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>>
>>
>>