On Tue, Jul 24, 2012 at 6:08 AM, Jack Luo <[email protected]> wrote:
> Hi All,
>
> I am using ZooKeeper 3.3.5 for a distributed project. During testing we
> found a watch-related issue. Our monitor program places 100 watches on 100
> different paths (e.g. /goo1 …. /goo100) to monitor data changes, and
> another writer program updates one of the paths at a specified interval. We
> found that some data change notifications are sometimes lost when the
> monitor program is switched to a new server due to the failure of the
> current server.
>
> I checked the “watch management” section in the current release notes
> http://zookeeper.apache.org/doc/trunk/releasenotes.html and found the
> statement “In this release the client library tracks watches that a client
> has registered and reregisters the watches when a connection is made to a
> new server.” Based on this, it looks like losing data change notifications
> during server failover, before the watches are successfully re-registered
> on a new server, is expected behavior.
>
See the programmer's guide here:
http://zookeeper.apache.org/doc/r3.3.5/zookeeperProgrammers.html#ch_zkWatches

"When a client reconnects, any previously registered watches will be
reregistered and triggered if needed. In general this all occurs
transparently. There is one case where a watch may be missed: a watch for
the existence of a znode not yet created will be missed if the znode is
created and deleted while disconnected."

So you should not lose any notifications in this case.

> The solution that I came up with for this issue is to query all 100 paths
> to check whether any data changed after the monitor program connects to
> a new server.
>
> However, if we need to monitor 1000 or 10K paths, this solution may not be
> good. Can anyone suggest a better solution to this issue?
>
> Furthermore, can the ZK service be enhanced to replicate the watches on
> each ZK server to solve this issue for good?
>

The client maintains the zxid of the last change it saw from the server.
When it re-registers its watches, it is notified of any changes made since
that zxid. So this is already supported. Sounds like a bug to me, but I've
not heard of any such issues from our users.

Patrick
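
To illustrate the zxid-based catch-up described above, here is a minimal toy
model in Python. It is only a sketch and not the ZooKeeper client API:
ToyServer, ToyClient, set_watches and reconnect are hypothetical names, a
single in-memory object stands in for the replicated state that any server
in the ensemble would expose, and the one-shot nature of real watches is
ignored. The idea it shows is that on reconnect the client sends its watched
paths together with the last zxid it saw, and the server fires a
notification for every path modified after that zxid.

```python
class ToyServer:
    """In-memory stand-in for replicated ZooKeeper state (hypothetical)."""

    def __init__(self):
        self.zxid = 0    # id of the last committed transaction
        self.mzxid = {}  # path -> zxid of that path's last modification

    def set_data(self, path):
        # Every write gets the next transaction id.
        self.zxid += 1
        self.mzxid[path] = self.zxid
        return self.zxid

    def set_watches(self, paths, last_seen_zxid):
        # Re-register watches and report paths changed after last_seen_zxid.
        return sorted(p for p in paths if self.mzxid.get(p, 0) > last_seen_zxid)


class ToyClient:
    """Tracks watched paths and the last zxid seen (hypothetical)."""

    def __init__(self, server, paths):
        self.server = server
        self.watched = set(paths)
        self.last_zxid = server.zxid

    def reconnect(self):
        # Send watches plus last seen zxid; the server replies with missed changes.
        fired = self.server.set_watches(self.watched, self.last_zxid)
        self.last_zxid = self.server.zxid
        return fired


server = ToyServer()
paths = [f"/goo{i}" for i in range(1, 101)]
for p in paths:
    server.set_data(p)

client = ToyClient(server, paths)

# Two writes happen while the client is disconnected.
server.set_data("/goo7")
server.set_data("/goo42")

missed = client.reconnect()
print(missed)  # ['/goo42', '/goo7']
```

In this model the client does not need to re-read all 100 (or 10K) paths
after a failover: only the paths whose modification zxid is newer than the
client's last seen zxid produce a notification, and a second reconnect with
no intervening writes produces none.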
