hey Jeff, Your assessment looks pretty spot on to me. I guess it's never come up before as it's an edge case and for most use cases I imagine that the consequences are minimal (the order of leadership just changes and it would take a bit longer for the leadership to occur).
If you'd like to create a PR, preferably with a unit test reproducing the issue if possible, I'd be happy to review it. cheers Cam On Thu, Jun 8, 2017 at 10:15 AM, Jeff Clites <[email protected]> wrote: > At the bottom of the LeaderLatch.checkLeadership() method (in Curator > 2.12.0), the following is called to set a watch on the waiter one spot > ahead of us, as expected: > > String watchPath = sortedChildren.get(ourIndex - 1); > ... > client.getData().usingWatcher(watcher).inBackground( > callback).forPath(ZKPaths.makePath(latchPath, watchPath)); > > When the watch fires and has event type NodeDeleted, then > LeaderLatch.getChildren() is called to determine if we are now the leader, > again as expected. > > But I don't understand the following. The code for the callback (called in > response to the getData() completing) has this: > > if ( event.getResultCode() == KeeperException.Code.NONODE.intValue() ) > { > // previous node is gone - reset > reset(); > } > > It seems that this should be a call to getChildren(), rather than to > reset(), since this means that the waiter ahead of us has disappeared (the > same situation we were watch-ing for, but it happened before we could even > set a watch), whereas reset() would be the right reaction if our path had > disappeared. It looks to me like this code "thinks" it's a watch about our > node, rather than the one-ahead-of-use node. I think this would cause us to > delete and re-create our node, effectively moving us to the back of the > wait queue, rather than making us the leader now. > > Is this a bug, or am I misinterpreting something? > > Looking at the history of this code, it looks like it did in fact used to > check for leadership at this point, and got changed when the code was > updated to use background calls (in commit sha ff4ec29f). > > Thanks in advance, > > Jeff Clites >
