The Lock recipe's overview describes it as "Fully distributed locks that are globally synchronous, meaning at any snapshot in time no two clients think they hold the same lock." We've implemented this pattern, but we've run into an issue handling ZooKeeper errors that seems to violate the "no two clients think they hold the same lock" semantics. For example:

    Thread1.client1.lock();
    Thread2.client2.lock();
    // client1 gets the lock, so it starts some work
    Thread1.client1.doWork();
    // but now I get a session timeout
    // in the worst case it's because doWork() caused a full GC that took > sessionTimeout
    // my client then has to reconnect with a new session ID
    Thread1.client1.reconnect();

My question is: how have people handled this case so that Thread1.client1 is notified that it no longer holds the lock? Without a lot of pedantic calls to Thread1.client1.doIStillHaveTheLock() inside doWork(), it seems that two clients can both think they hold the lock. Even if you check the state of your lock repeatedly, there are still small windows of time in which two clients are inside the lock.

I could interrupt Thread1 when reconnecting, but if you're using the lock for multithreaded synchronization that won't help.

I realize the limitations of ZooKeeper in this case, but I also hope someone else has already solved this problem intelligently.

- will
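
For illustration, the polling approach described above might look like the following minimal Java sketch. LockLease, GuardedWorker, onSessionExpired() and runSteps() are hypothetical names, not part of ZooKeeper or the Lock recipe. The sketch assumes that a real client would call onSessionExpired() from its ZooKeeper Watcher when it receives a KeeperState.Expired event.

```java
// Hypothetical holder for "do I still have the lock?" state.
// Assumption: onSessionExpired() is invoked by the client's session
// watcher; nothing here talks to ZooKeeper directly.
final class LockLease {
    // volatile so the worker thread sees a change made by the watcher thread
    private volatile boolean held = true;

    void onSessionExpired() {
        held = false;
    }

    boolean isHeld() {
        return held;
    }
}

// Hypothetical worker that splits doWork() into steps and checks the
// lease before each one, i.e. the "pedantic" polling from the question.
final class GuardedWorker {
    // Runs steps in order, stopping before the first step that would
    // start after the lease was lost. Returns the number of steps run.
    static int runSteps(LockLease lease, Runnable[] steps) {
        int completed = 0;
        for (Runnable step : steps) {
            if (!lease.isHeld()) {
                break; // lock lost: do not start further work
            }
            step.run();
            completed++;
        }
        return completed;
    }
}
```

This only narrows the window. A step that is already running when the session expires still runs to completion, and a thread stalled by a long GC pause cannot observe the expiry until it resumes, so two clients can still briefly believe they hold the lock.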
