Hi, I took a look at the code (which I haven't looked at in 5 or more years). It looks like the reconnection behavior _is_ different. Persistent watches will miss some events that other watches are getting. This is indeed a very long-standing bug.
I'd be willing to work on this, but there are likely devs who are more familiar with the code now who can do it.

-JZ

> On Jul 25, 2025, at 8:06 PM, Jordan Zimmerman <[email protected]> wrote:
>
> Hi,
>
> Persistent watches are the same watch as every other watch. It all goes
> through the same code. Let's look at the doc:
>
> > Because standard watches are one time triggers and there is latency between
> > getting the event and sending a new request to
> > get a watch you cannot reliably see every change that happens to a node in
> > ZooKeeper. Be prepared to handle the case where
> > the znode changes multiple times between getting the event and setting the
> > watch again. (You may not care, but at least realize it may happen.)
>
> ZooKeeper does not keep any kind of queue of events. You cannot count on
> seeing every event in ZooKeeper. Watchers are triggered as events happen.
> Again, it's been a very long time since I've looked at the code but this is
> my memory of how it works. When I wrote Persistent watches, I used all
> the existing watch code. A Persistent watch is the exact same code path as
> all other watches. The only difference is that they don't get deleted after
> firing. Also, recursive watches trigger for child nodes being watched. But,
> again, same code path.
>
> I hope this helps.
>
> -JZ
>
>
>> On Jul 25, 2025, at 7:30 PM, Li Wang <[email protected]> wrote:
>>
>> Thanks for the input, Jordan.
>>
>> My understanding is that the standard watches do but persistent watches
>> don't. Not sure if I'm missing anything or if this is a bug. Looking forward to
>> any feedback/input on this.
>>
>> 1. We have the following in the standard watch section of the ZooKeeper
>> documentation, and it looks like missed notifications are triggered.
>>
>>> When a client reconnects, any previously registered watches will be
>>> reregistered and triggered if needed.
>>
>> https://zookeeper.apache.org/doc/r3.9.3/zookeeperProgrammers.html#sc_WatchSemantics
>>
>> 2. In the code base, the ZooKeeper client library maintains lastZxid in memory
>> and sends it to the server when resetting watches upon reconnection. The
>> server detects whether any missed notifications need to be triggered based on
>> the lastZxid.
>>
>> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/ClientCnxn.java#L1040-L1041
>> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L1497
>>
>> 3. The problem is that missed notifications seem to be triggered only for
>> standard watches, not for persistent watches, when reconnecting.
>>
>> For example, for standard watches, watcher.process() is invoked to send
>> missed notifications.
>>
>>> for (String path : dataWatches) {
>>>     DataNode node = getNode(path);
>>>     if (node == null) {
>>>         watcher.process(new WatchedEvent(EventType.NodeDeleted,
>>>             KeeperState.SyncConnected, path));
>>>     } else if (node.stat.getMzxid() > relativeZxid) {
>>>         watcher.process(new
>>>             WatchedEvent(EventType.NodeDataChanged, KeeperState.SyncConnected, path));
>>>     } else {
>>>         this.dataWatches.addWatch(path, watcher);
>>>     }
>>> }
>>
>> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L1494-L1521
>>
>> However, for persistent watches, we only register the watches, without
>> detecting and sending missed notifications.
>>
>>> for (String path : persistentRecursiveWatches) {
>>>     this.dataWatches.addWatch(path, watcher,
>>>         WatcherMode.PERSISTENT_RECURSIVE);
>>> }
>>
>> https://github.com/apache/zookeeper/blob/master/zookeeper-server/src/main/java/org/apache/zookeeper/server/DataTree.java#L1494-L1521
>>
>> Thanks,
>>
>> Li
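
For anyone following along, the difference between the two loops quoted above can be sketched as a toy model. This is not ZooKeeper code: the class ReconnectSketch, the method resetWatches, and the path-to-mzxid map are all made up for illustration, and a node is reduced to just its last-modified zxid. It only mirrors the branching shown in the quoted snippets.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model only: none of these names are ZooKeeper APIs.
class ReconnectSketch {
    // Watches that end up registered after the reset, kept per kind.
    final Set<String> standardWatches = new HashSet<>();
    final Set<String> persistentRecursiveWatches = new HashSet<>();

    /**
     * Replays a watch reset after reconnect.
     *
     * @param nodes        existing paths mapped to their mzxid (hypothetical stand-in for the data tree)
     * @param relativeZxid last zxid the client reports having seen
     * @return catch-up events delivered during the reset, in iteration order
     */
    List<String> resetWatches(Map<String, Long> nodes, long relativeZxid,
                              List<String> standard, List<String> persistentRecursive) {
        List<String> events = new ArrayList<>();
        for (String path : standard) {
            Long mzxid = nodes.get(path);
            if (mzxid == null) {
                // Node vanished while the client was away: fire instead of re-registering.
                events.add("NodeDeleted " + path);
            } else if (mzxid > relativeZxid) {
                // Node changed after the client's last seen zxid: fire a catch-up event.
                events.add("NodeDataChanged " + path);
            } else {
                // Nothing missed: just put the one-shot watch back.
                standardWatches.add(path);
            }
        }
        for (String path : persistentRecursive) {
            // As in the quoted loop: re-register only, no comparison against relativeZxid.
            persistentRecursiveWatches.add(path);
        }
        return events;
    }

    public static void main(String[] args) {
        ReconnectSketch sketch = new ReconnectSketch();
        Map<String, Long> nodes = Map.of("/a", 12L, "/b", 5L);
        List<String> events = sketch.resetWatches(
            nodes, 10L, List.of("/a", "/b", "/gone"), List.of("/a"));
        System.out.println(events); // [NodeDataChanged /a, NodeDeleted /gone]
        System.out.println(sketch.persistentRecursiveWatches);
    }
}
```

In this model, "/a" changed at zxid 12 while the client had only seen up to 10. The standard watch on "/a" gets a catch-up event, while the persistent recursive watch on the same path is silently re-registered, which is the gap described in point 3.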
