I don’t know if anything changed in ZooKeeper itself. I know that the connection states changed in Curator, but Curator now tests both the old mode and the new mode and they both fail here.
-JZ > On Feb 7, 2016, at 1:06 AM, Scott Blum <[email protected]> wrote: > > I need to analyze this a bit deeper, but what I'm seeing on the 3.0 branch is > that the ephemeral node /test/me created in testKilledSession() really isn't > disappearing when it should. > > After the session loss and the reconnect, /test still shows 2 children [foo, > me] and /test/me still returns a node. > > Any idea why the timing here would have changed? > > On Sat, Feb 6, 2016 at 1:41 PM, Scott Blum <[email protected] > <mailto:[email protected]>> wrote: > https://issues.apache.org/jira/browse/CURATOR-302 > <https://issues.apache.org/jira/browse/CURATOR-302> > > I need to trace through what's really going on under the hood rather than > band-aid the test. Should be able to in next couple of days. > > On Sat, Feb 6, 2016 at 7:43 AM, Jordan Zimmerman <[email protected] > <mailto:[email protected]>> wrote: > Any update on this? I think you should create a Jira for it. > > -Jordan > > >> On Feb 5, 2016, at 12:30 PM, Scott Blum <[email protected] >> <mailto:[email protected]>> wrote: >> >> BTW, this test passes on master... so it's some kind of 3.0 vs. master >> issue. I think I'm going to just have to dump in a ton of log messages and >> see what differs. >> >> On Fri, Feb 5, 2016 at 12:25 PM, Jordan Zimmerman >> <[email protected] <mailto:[email protected]>> wrote: >> OK - please create a new Issue in Jira for this. >> >> -Jordan >> >>> On Feb 5, 2016, at 12:24 PM, Scott Blum <[email protected] >>> <mailto:[email protected]>> wrote: >>> >>> BTW: this is broken on CURATOR-3.0 as well, so it appears to have been >>> broken for a while. Maybe I'll have to git bisect... >>> >>> On Fri, Feb 5, 2016 at 12:22 PM, Scott Blum <[email protected] >>> <mailto:[email protected]>> wrote: >>> Okay, so I looked into this for a bit, and I hit kind of a wall. I think >>> there is a legit bug/race in TreeCache, and the following patch *should* >>> remedy: >>> >>> diff --git >>> a/curator-recipes/src/main/java/org/apache/curator/framework/recipes/cache/TreeCache.java >>> >>> b/curator-recipes/src/main/java/org/apache/curator/framework/recipes/cache/TreeCache.java >>> index df4403c..a4a022b 100644 >>> --- >>> a/curator-recipes/src/main/java/org/apache/curator/framework/recipes/cache/TreeCache.java >>> +++ >>> b/curator-recipes/src/main/java/org/apache/curator/framework/recipes/cache/TreeCache.java >>> @@ -303,7 +303,6 @@ public class TreeCache implements Closeable >>> void wasDeleted() throws Exception >>> { >>> ChildData oldChildData = childData.getAndSet(null); >>> - >>> client.watches().remove(this).ofType(WatcherType.Any).locally().inBackground().forPath(path); >>> ConcurrentMap<String, TreeNode> childMap = >>> children.getAndSet(null); >>> if ( childMap != null ) >>> { >>> @@ -807,8 +806,16 @@ public class TreeCache implements Closeable >>> case RECONNECTED: >>> try >>> { >>> + outstandingOps.incrementAndGet(); >>> root.wasReconnected(); >>> publishEvent(TreeCacheEvent.Type.CONNECTION_RECONNECTED); >>> + if ( outstandingOps.decrementAndGet() == 0 ) >>> + { >>> + if ( isInitialized.compareAndSet(false, true) ) >>> + { >>> + publishEvent(TreeCacheEvent.Type.INITIALIZED); >>> + } >>> + } >>> } >>> catch ( Exception e ) >>> { >>> >>> That should guarantee that the initialized event gets deferred until all >>> outstanding refreshes finish.. but it's not. Something seems to have >>> changed under the hood in how background events are getting sent to >>> TreeCache, and I don't really understand it yet. And running the debugger >>> seems to affect the timing, like something racy is going on. :( >>> >>> >>> On Fri, Feb 5, 2016 at 11:57 AM, Scott Blum <[email protected] >>> <mailto:[email protected]>> wrote: >>> Ok, that is kind of weird. I'll take a look. >>> >>> On Fri, Feb 5, 2016 at 4:58 AM, Jordan Zimmerman >>> <[email protected] <mailto:[email protected]>> wrote: >>> No, sorry. The last few lines of the test currently are: >>> >>> assertEvent(TreeCacheEvent.Type.NODE_REMOVED, "/test/me", >>> "data".getBytes()); >>> assertEvent(TreeCacheEvent.Type.INITIALIZED); >>> >>> This fails. But, if I switch them it works: >>> >>> assertEvent(TreeCacheEvent.Type.INITIALIZED); >>> assertEvent(TreeCacheEvent.Type.NODE_REMOVED, "/test/me", >>> "data".getBytes()); >>> >>>> On Feb 5, 2016, at 2:57 AM, Scott Blum <[email protected] >>>> <mailto:[email protected]>> wrote: >>>> >>>> So you end up with 2 initialized events? >>>> >>>> You mean this? >>>> >>>> assertEvent(TreeCacheEvent.Type.CONNECTION_RECONNECTED); >>>> + assertEvent(TreeCacheEvent.Type.INITIALIZED); >>>> assertEvent(TreeCacheEvent.Type.NODE_REMOVED, "/test/me", >>>> "data".getBytes()); >>>> assertEvent(TreeCacheEvent.Type.INITIALIZED); >>>> >>>> Seems weird if there are two, but I can help look. >>>> >>>> On Thu, Feb 4, 2016 at 10:48 PM, Jordan Zimmerman >>>> <[email protected] <mailto:[email protected]>> wrote: >>>> Hey Scott, >>>> >>>> In this branch, TestTreeCache.testKilledSession() is failing at: >>>> >>>> assertEvent(TreeCacheEvent.Type.NODE_REMOVED, "/test/me", >>>> "data".getBytes()); >>>> >>>> However, if I change the two asserts to: >>>> >>>> assertEvent(TreeCacheEvent.Type.INITIALIZED); >>>> assertEvent(TreeCacheEvent.Type.NODE_REMOVED, "/test/me", >>>> "data".getBytes()); >>>> >>>> it works. Does that make any sense? >>>> >>>> -Jordan >>>> >>>> > On Feb 4, 2016, at 9:23 PM, Jordan Zimmerman <[email protected] >>>> > <mailto:[email protected]>> wrote: >>>> > >>>> > Devs, >>>> > >>>> > In trying to fix the bad log message "Failed to find watcher” (which >>>> > turns out to be a ZK client issue), I realize that the NamespaceWatcher >>>> > and WatcherWrapper stuff could be improved. I’m still working on getting >>>> > all tests to pass but I’d appreciate more sets of eyes on this change. >>>> > Please review carefully if you can. >>>> > >>>> > https://github.com/apache/curator/pull/131 >>>> > <https://github.com/apache/curator/pull/131> >>>> > >>>> > -Jordan >>>> >>>> >>> >>> >>> >>> >> >> > > >
