I am wondering if I am heading down a bad path. We have implemented distributed locking in ZooKeeper by hand. A lock is acquired by creating a znode; it is released by deleting the znode. Simple, right? That part works great.
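For concreteness, here is a minimal sketch of the pattern in Python, using an in-memory stand-in for ZooKeeper rather than a real client (`FakeZooKeeper`, `LockClient`, and the per-path generation counter are all illustrative assumptions, not real ZooKeeper API and not our actual code). The generation check shows one possible way to ignore a watch callback left over from an earlier lock/unlock cycle on the same path, which is the delayed-callback hazard described below:

```python
class FakeZooKeeper:
    """In-memory stand-in: stores znodes and queues one-shot
    exists-watch callbacks whose delivery may be delayed."""
    def __init__(self):
        self.znodes = set()
        self.watches = {}   # path -> list of callbacks
        self.pending = []   # fired but not-yet-delivered callbacks

    def create(self, path):
        self.znodes.add(path)

    def delete(self, path):
        self.znodes.discard(path)
        # Deleting fires every exists watch on the path, but delivery
        # to the client can be arbitrarily delayed.
        for cb in self.watches.pop(path, []):
            self.pending.append(lambda cb=cb: cb(path))

    def exists_watch(self, path, cb):
        self.watches.setdefault(path, []).append(cb)

    def deliver_pending(self):
        while self.pending:
            self.pending.pop(0)()


class LockClient:
    """Tags each acquisition with a generation number so that a
    delayed watch callback from a previous acquisition of the same
    path can be recognised and ignored."""
    def __init__(self, zk):
        self.zk = zk
        self.generation = {}  # path -> current acquisition number
        self.held = set()

    def lock(self, path):
        gen = self.generation.get(path, 0) + 1
        self.generation[path] = gen
        self.zk.create(path)
        self.held.add(path)
        # Install the exists watch immediately after creating the znode.
        self.zk.exists_watch(path, lambda p, g=gen: self._on_deleted(p, g))

    def unlock(self, path):
        self.held.discard(path)
        self.zk.delete(path)

    def _on_deleted(self, path, gen):
        # Stale callback from an earlier lock/unlock cycle: ignore it,
        # otherwise the re-acquired lock would wrongly be treated as gone.
        if gen != self.generation.get(path):
            return
        self.held.discard(path)  # lock genuinely broken by a third party
```

Without the generation check, a callback delayed across an unlock/re-lock of the same path would make the client drop its record of the second acquisition while the znode still exists, which is exactly the failure mode described below.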
Here's where the complexity comes in: we also needed to be able to detect whether a lock was "broken" -- i.e. the znode was deleted by some third party. For this we use an "exists" watcher that is installed via an asynchronous call immediately after the znode is created. Of course, the "exists" handler is invoked whenever the znode is deleted, whether by a third party or by the client itself, so we keep state in the client to remember which znodes were previously locked.

An issue was discovered in which a client locks, unlocks, and then re-locks the same data. It is possible for the "exists" callback to be delayed and not get delivered until the data is locked the second time. This leads to what we call a "leaked lock": the znode is created in ZooKeeper, but the client will not unlock it, since it thinks the znode was already deleted. I'm working on a fix for this issue too. It just seems to be getting more complex and risky, and I am wondering if I am going astray.

Are the watches reliable enough to _guarantee_ that I will receive exactly one callback for each delete event? Even if a session fails over to another node?

John

--
View this message in context: http://zookeeper-user.578899.n2.nabble.com/Guarantees-of-an-exists-watcher-tp7581088.html
Sent from the zookeeper-user mailing list archive at Nabble.com.
