Here's a summary: On reconnect, watches are reset. For data watches, if the node no longer exists, the watch gets NodeDeleted. If the node's mzxid is newer than the last zxid the client saw, the watch gets NodeDataChanged. Exists and child watches have similar handling. Persistent watches, on the other hand, are merely re-registered, with no catch-up events.
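As a rough sketch of the decision described above (this is not the actual ZooKeeper code; the class, enum, and method names are hypothetical stand-ins, and a null mzxid is my assumption for "node no longer exists"):

```java
import java.util.Optional;

// Hypothetical, self-contained model of the watch-reset decision on reconnect.
// It does not use or reproduce any ZooKeeper classes.
final class WatchResetSketch {

    // Local stand-in for the two event types mentioned in the summary.
    enum EventType { NodeDeleted, NodeDataChanged }

    // Data watch: nodeMzxid is null when the node no longer exists.
    // relativeZxid is the last zxid the client reports having seen.
    static Optional<EventType> missedEventForDataWatch(Long nodeMzxid, long relativeZxid) {
        if (nodeMzxid == null) {
            return Optional.of(EventType.NodeDeleted);
        }
        if (nodeMzxid > relativeZxid) {
            return Optional.of(EventType.NodeDataChanged);
        }
        // Nothing was missed: the watch is simply re-registered.
        return Optional.empty();
    }

    // Persistent watch: always re-registered, never sent a catch-up event,
    // regardless of what happened to the node while disconnected.
    static Optional<EventType> missedEventForPersistentWatch(Long nodeMzxid, long relativeZxid) {
        return Optional.empty();
    }

    public static void main(String[] args) {
        long lastSeen = 100L;
        System.out.println("data, node deleted:       " + missedEventForDataWatch(null, lastSeen));
        System.out.println("data, node changed:       " + missedEventForDataWatch(150L, lastSeen));
        System.out.println("data, node unchanged:     " + missedEventForDataWatch(90L, lastSeen));
        System.out.println("persistent, node deleted: " + missedEventForPersistentWatch(null, lastSeen));
    }
}
```

The point of the sketch is only the asymmetry: the same disconnected-period change yields an event for a data watch and nothing for a persistent one.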

I no longer remember why we didn't mimic this for Persistent watches. I guess it can be argued that it isn't necessary or that it could result in a _lot_ of persistent watch calls. Maybe the right thing to do is to just document the difference and leave it, as it's been this way for years.

-Jordan

> On Jul 25, 2025, at 9:58 PM, Keith Turner <ktur...@apache.org> wrote:
>
> On 2025/07/25 19:23:41 Jordan Zimmerman wrote:
>> Hi,
>>
>> I took a look at the code (which I haven't looked at in 5 or more years). It
>> looks like the reconnection behavior _is_ different. Persistent watches will
>> miss some events that other watches are getting. This is indeed a very
>> long-standing bug.
>
> What events are missed for persistent recursive watchers that normal watchers
> see?
>
>>
>> I'd be willing to work on this, but there's likely devs who are more
>> familiar with the code now who can do it.
>>
>> -JZ
>>
>>> On Jul 25, 2025, at 8:06 PM, Jordan Zimmerman <jor...@jordanzimmerman.com>
>>> wrote:
>>>
>>> Hi,
>>>
>>> Persistent watches are the same watch as every other watch. It all goes
>>> through the same code. Let's look at the doc:
>>>
>>>> Because standard watches are one time triggers and there is latency
>>>> between getting the event and sending a new request to
>>>> get a watch you cannot reliably see every change that happens to a node in
>>>> ZooKeeper. Be prepared to handle the case where
>>>> the znode changes multiple times between getting the event and setting the
>>>> watch again. (You may not care, but at least realize it may happen.)
>>>
>>> ZooKeeper does not keep any kind of queue of events. You cannot count on
>>> seeing every event in ZooKeeper. Watchers are triggered as events happen.
>>> Again, it's been a very long time since I've looked at the code but this is
>>> my memory of how it works. When I wrote Persistent watches, I used all
>>> the existing watch code. A Persistent watch is the exact same code path as
>>> all other watches. The only difference is that they don't get deleted after
>>> firing. Also, recursive watches trigger for child nodes being watched. But,
>>> again, same code path.
>>>
>>> I hope this helps.
>>>
>>> -JZ
>>>
>>>> On Jul 25, 2025, at 7:30 PM, Li Wang <li4w...@gmail.com> wrote:
>>>>
>>>> Thanks for the input, Jordan.
>>>>
>>>> My understanding is that the standard watches do but persistent watches
>>>> don't. Not sure if I'm missing anything or if this is a bug. Looking
>>>> forward to any feedback/input on this.
>>>>
>>>> 1. We have the following in the standard watch section of the ZooKeeper
>>>> documentation, and it looks like missing notifications are triggered.
>>>>
>>>>> When a client reconnects, any previously registered watches will be
>>>>> reregistered and triggered if needed.
>>>>
>>>> https://zookeeper.apache.org/doc/r3.9.3/zookeeperProgrammers.html#sc_WatchSemantics
>>>>
>>>> 2. In the code base, the ZooKeeper client library maintains lastZxid in
>>>> memory and sends it to the server when resetting watches upon
>>>> reconnection. The server detects whether any missing notifications need
>>>> to be triggered based on the lastZxid.
>>>>
>>>> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L1040-L1041
>>>> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L1497
>>>>
>>>> 3. The problem is that missing notifications seem to be triggered only for
>>>> standard watches, not for persistent watches, when reconnecting.
>>>>
>>>> For example, for standard watches, watcher.process() is invoked to send
>>>> missing notifications.
>>>>
>>>>> for (String path : dataWatches) {
>>>>>     DataNode node = getNode(path);
>>>>>     if (node == null) {
>>>>>         watcher.process(new WatchedEvent(EventType.NodeDeleted,
>>>>>             KeeperState.SyncConnected, path));
>>>>>     } else if (node.stat.getMzxid() > relativeZxid) {
>>>>>         watcher.process(new WatchedEvent(EventType.NodeDataChanged,
>>>>>             KeeperState.SyncConnected, path));
>>>>>     } else {
>>>>>         this.dataWatches.addWatch(path, watcher);
>>>>>     }
>>>>> }
>>>>
>>>> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L1494-L1521
>>>>
>>>> However, for persistent watches, we only register the watches, without
>>>> detecting and sending missing notifications.
>>>>
>>>>> for (String path : persistentRecursiveWatches) {
>>>>>     this.dataWatches.addWatch(path, watcher,
>>>>>         WatcherMode.PERSISTENT_RECURSIVE);
>>>>> }
>>>>
>>>> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L1494-L1521
>>>>
>>>> Thanks,
>>>>
>>>> Li