Any update on this? I think you should create a Jira for it. -Jordan
> On Feb 5, 2016, at 12:30 PM, Scott Blum <[email protected]> wrote: > > BTW, this test passes on master... so it's some kind of 3.0 vs. master issue. > I think I'm going to just have to dump in a ton of log messages and see what > differs. > > On Fri, Feb 5, 2016 at 12:25 PM, Jordan Zimmerman <[email protected] > <mailto:[email protected]>> wrote: > OK - please create a new Issue in Jira for this. > > -Jordan > >> On Feb 5, 2016, at 12:24 PM, Scott Blum <[email protected] >> <mailto:[email protected]>> wrote: >> >> BTW: this is broken on CURATOR-3.0 as well, so it appears to have been >> broken for a while. Maybe I'll have to git bisect... >> >> On Fri, Feb 5, 2016 at 12:22 PM, Scott Blum <[email protected] >> <mailto:[email protected]>> wrote: >> Okay, so I looked into this for a bit, and I hit kind of a wall. I think >> there is a legit bug/race in TreeCache, and the following patch *should* >> remedy: >> >> diff --git >> a/curator-recipes/src/main/java/org/apache/curator/framework/recipes/cache/TreeCache.java >> >> b/curator-recipes/src/main/java/org/apache/curator/framework/recipes/cache/TreeCache.java >> index df4403c..a4a022b 100644 >> --- >> a/curator-recipes/src/main/java/org/apache/curator/framework/recipes/cache/TreeCache.java >> +++ >> b/curator-recipes/src/main/java/org/apache/curator/framework/recipes/cache/TreeCache.java >> @@ -303,7 +303,6 @@ public class TreeCache implements Closeable >> void wasDeleted() throws Exception >> { >> ChildData oldChildData = childData.getAndSet(null); >> - >> client.watches().remove(this).ofType(WatcherType.Any).locally().inBackground().forPath(path); >> ConcurrentMap<String, TreeNode> childMap = >> children.getAndSet(null); >> if ( childMap != null ) >> { >> @@ -807,8 +806,16 @@ public class TreeCache implements Closeable >> case RECONNECTED: >> try >> { >> + outstandingOps.incrementAndGet(); >> root.wasReconnected(); >> publishEvent(TreeCacheEvent.Type.CONNECTION_RECONNECTED); >> + if ( outstandingOps.decrementAndGet() == 0 ) >> + { >> + if ( isInitialized.compareAndSet(false, true) ) >> + { >> + publishEvent(TreeCacheEvent.Type.INITIALIZED); >> + } >> + } >> } >> catch ( Exception e ) >> { >> >> That should guarantee that the initialized event gets deferred until all >> outstanding refreshes finish.. but it's not. Something seems to have >> changed under the hood in how background events are getting sent to >> TreeCache, and I don't really understand it yet. And running the debugger >> seems to affect the timing, like something racy is going on. :( >> >> >> On Fri, Feb 5, 2016 at 11:57 AM, Scott Blum <[email protected] >> <mailto:[email protected]>> wrote: >> Ok, that is kind of weird. I'll take a look. >> >> On Fri, Feb 5, 2016 at 4:58 AM, Jordan Zimmerman <[email protected] >> <mailto:[email protected]>> wrote: >> No, sorry. The last few lines of the test currently are: >> >> assertEvent(TreeCacheEvent.Type.NODE_REMOVED, "/test/me", "data".getBytes()); >> assertEvent(TreeCacheEvent.Type.INITIALIZED); >> >> This fails. But, if I switch them it works: >> >> assertEvent(TreeCacheEvent.Type.INITIALIZED); >> assertEvent(TreeCacheEvent.Type.NODE_REMOVED, "/test/me", "data".getBytes()); >> >>> On Feb 5, 2016, at 2:57 AM, Scott Blum <[email protected] >>> <mailto:[email protected]>> wrote: >>> >>> So you end up with 2 initialized events? >>> >>> You mean this? >>> >>> assertEvent(TreeCacheEvent.Type.CONNECTION_RECONNECTED); >>> + assertEvent(TreeCacheEvent.Type.INITIALIZED); >>> assertEvent(TreeCacheEvent.Type.NODE_REMOVED, "/test/me", >>> "data".getBytes()); >>> assertEvent(TreeCacheEvent.Type.INITIALIZED); >>> >>> Seems weird if there are two, but I can help look. >>> >>> On Thu, Feb 4, 2016 at 10:48 PM, Jordan Zimmerman >>> <[email protected] <mailto:[email protected]>> wrote: >>> Hey Scott, >>> >>> In this branch, TestTreeCache.testKilledSession() is failing at: >>> >>> assertEvent(TreeCacheEvent.Type.NODE_REMOVED, "/test/me", >>> "data".getBytes()); >>> >>> However, if I change the two asserts to: >>> >>> assertEvent(TreeCacheEvent.Type.INITIALIZED); >>> assertEvent(TreeCacheEvent.Type.NODE_REMOVED, "/test/me", >>> "data".getBytes()); >>> >>> it works. Does that make any sense? >>> >>> -Jordan >>> >>> > On Feb 4, 2016, at 9:23 PM, Jordan Zimmerman <[email protected] >>> > <mailto:[email protected]>> wrote: >>> > >>> > Devs, >>> > >>> > In trying to fix the bad log message "Failed to find watcher” (which >>> > turns out to be a ZK client issue), I realize that the NamespaceWatcher >>> > and WatcherWrapper stuff could be improved. I’m still working on getting >>> > all tests to pass but I’d appreciate more sets of eyes on this change. >>> > Please review carefully if you can. >>> > >>> > https://github.com/apache/curator/pull/131 >>> > <https://github.com/apache/curator/pull/131> >>> > >>> > -Jordan >>> >>> >> >> >> >> > >
