Hi all,

There is a Jira issue for this:
https://issues.apache.org/jira/browse/ZOOKEEPER-4698
It has links to more context.
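
For anyone skimming the thread, the difference discussed below can be
modeled roughly like this. It is a standalone sketch, not the DataTree code:
WatchResetSketch, Outcome, and the map of path to mzxid are hypothetical
stand-ins, and only the data-watch branch is modeled, based on the snippets
Li quotes.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, standalone model of the reconnect handling quoted in this thread.
// All names here are hypothetical; this is not ZooKeeper's DataTree API.
public class WatchResetSketch {
    enum Outcome { NODE_DELETED, NODE_DATA_CHANGED, REREGISTERED }

    // Standard data watch: compare the node's mzxid with the last zxid the
    // client saw (relativeZxid) and fire a catch-up event if something changed.
    static Outcome resetDataWatch(Map<String, Long> mzxidByPath, String path, long relativeZxid) {
        Long mzxid = mzxidByPath.get(path);
        if (mzxid == null) {
            return Outcome.NODE_DELETED;        // node is gone
        } else if (mzxid > relativeZxid) {
            return Outcome.NODE_DATA_CHANGED;   // node was modified while disconnected
        }
        return Outcome.REREGISTERED;            // nothing missed, just add the watch back
    }

    // Persistent recursive watch: re-registered unconditionally, no catch-up event.
    static Outcome resetPersistentRecursiveWatch(String path) {
        return Outcome.REREGISTERED;
    }

    public static void main(String[] args) {
        Map<String, Long> tree = new HashMap<>();
        tree.put("/unchanged", 90L);
        tree.put("/changed", 120L);
        long lastZxid = 100L; // last zxid the client saw before disconnecting

        for (String path : new String[] {"/unchanged", "/changed", "/deleted"}) {
            System.out.println(path
                + " data watch: " + resetDataWatch(tree, path, lastZxid)
                + ", persistent recursive watch: " + resetPersistentRecursiveWatch(path));
        }
    }
}
```

In this model, "/changed" and "/deleted" produce an event for the standard
data watch but nothing for the persistent recursive watch, which is the gap
the issue is about.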

Best,
Kezhu Wang

On Sat, Jul 26, 2025 at 5:06 AM Jordan Zimmerman
<jor...@jordanzimmerman.com> wrote:
>
> Here's a summary:
>
> On reconnect, watches are reset. For data watches, if the node no longer
> exists, the watch will get NodeDeleted. If the node's mzxid is newer than the
> client's last seen zxid, the watch will get NodeDataChanged. Exist and child
> watches have similar handling. Persistent watches, on the other hand, are
> merely reset.
>
> I no longer remember why we didn't mimic this for Persistent watches. I guess
> it can be argued that it isn't necessary or that it could result in a _lot_
> of persistent watch calls. Maybe the right thing to do is to just document
> the difference and leave it as is, since it has been this way for years.
>
> -Jordan
>
> > On Jul 25, 2025, at 9:58 PM, Keith Turner <ktur...@apache.org> wrote:
> >
> >
> >
> > On 2025/07/25 19:23:41 Jordan Zimmerman wrote:
> >> Hi,
> >>
> >> I took a look at the code (which I haven't looked at in 5 or more years). 
> >> It looks like the reconnection behavior _is_ different. Persistent watches 
> >> will miss some events that other watches are getting. This is indeed a 
> >> very long-standing bug.
> >
> > What events do persistent recursive watchers miss that normal watchers
> > see?
> >
> >>
> >> I'd be willing to work on this, but there are likely devs who are more
> >> familiar with the code now who can do it.
> >>
> >> -JZ
> >>
> >>> On Jul 25, 2025, at 8:06 PM, Jordan Zimmerman 
> >>> <jor...@jordanzimmerman.com> wrote:
> >>>
> >>> Hi,
> >>>
> >>> Persistent watches are the same watch as every other watch. It all goes 
> >>> through the same code. Let's look at the doc:
> >>>
> >>>> Because standard watches are one time triggers and there is latency 
> >>>> between getting the event and sending a new request to
> >>>> get a watch you cannot reliably see every change that happens to a node 
> >>>> in ZooKeeper. Be prepared to handle the case where
> >>>> the znode changes multiple times between getting the event and setting 
> >>>> the watch again. (You may not care, but at least realize it may happen.)
> >>>
> >>> ZooKeeper does not keep any kind of queue of events. You cannot count on 
> >>> seeing every event in ZooKeeper. Watchers are triggered as events happen.
> >>> Again, it's been a very long time since I've looked at the code but this 
> >>> is my memory of how it works. When I wrote Persistent watches, I used all
> >>> the existing watch code. A Persistent watch is the exact same code path 
> >>> as all other watches. The only difference is that they don't get deleted
> >>> after firing. Also, recursive watches trigger for child nodes of the node
> >>> being watched. But, again, same code path.
> >>>
> >>> I hope this helps.
> >>>
> >>> -JZ
> >>>
> >>>
> >>>> On Jul 25, 2025, at 7:30 PM, Li Wang <li4w...@gmail.com> wrote:
> >>>>
> >>>> Thanks for the input, Jordan.
> >>>>
> >>>> My understanding is that standard watches get missed notifications on
> >>>> reconnect but persistent watches don't. Not sure if I am missing anything
> >>>> or if this is a bug. Looking forward to any feedback/input on this.
> >>>>
> >>>> 1. We have the following in the standard watch section of the ZooKeeper
> >>>> documentation, and it looks like missed notifications are triggered:
> >>>>
> >>>>> When a client reconnects, any previously registered watches will be
> >>>>> reregistered and triggered if needed.
> >>>>
> >>>>
> >>>>
> >>>> https://zookeeper.apache.org/doc/r3.9.3/zookeeperProgrammers.html#sc_WatchSemantics
> >>>>
> >>>>
> >>>> 2. In the code base, the ZooKeeper client library maintains lastZxid in
> >>>> memory and sends it to the server when resetting watches upon
> >>>> reconnection. The server detects whether any missed notifications need
> >>>> to be triggered based on the lastZxid.
> >>>>
> >>>> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L1040-L1041
> >>>> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L1497
> >>>>
> >>>> 3. The problem is that, when reconnecting, missed notifications seem to
> >>>> be triggered only for standard watches, not for persistent watches.
> >>>>
> >>>> For example, for standard watches, watcher.process() is invoked to send
> >>>> the missed notifications.
> >>>>
> >>>>> for (String path : dataWatches) {
> >>>>>     DataNode node = getNode(path);
> >>>>>     if (node == null) {
> >>>>>         watcher.process(new WatchedEvent(EventType.NodeDeleted,
> >>>>>             KeeperState.SyncConnected, path));
> >>>>>     } else if (node.stat.getMzxid() > relativeZxid) {
> >>>>>         watcher.process(new WatchedEvent(EventType.NodeDataChanged,
> >>>>>             KeeperState.SyncConnected, path));
> >>>>>     } else {
> >>>>>         this.dataWatches.addWatch(path, watcher);
> >>>>>     }
> >>>>> }
> >>>>
> >>>>
> >>>> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L1494-L1521
> >>>>
> >>>> However, for persistent watches, we only re-register the watches, without
> >>>> detecting and sending missed notifications.
> >>>>
> >>>>> for (String path : persistentRecursiveWatches) {
> >>>>>     this.dataWatches.addWatch(path, watcher,
> >>>>>         WatcherMode.PERSISTENT_RECURSIVE);
> >>>>> }
> >>>>
> >>>>
> >>>> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L1494-L1521
> >>>>
> >>>> Thanks,
> >>>>
> >>>> Li
> >>>
> >>
> >>
>
