See here: http://curator.apache.org/errors.html
You should set a ConnectionStateListener. If you get SUSPENDED, assume that your lock is lost and cancel whatever activity you are doing.

-Jordan

On February 11, 2015 at 2:32:31 AM, Jipeng Tan ([email protected]) wrote:

Hi All,

I would like to ask a question about the best practice for handling leader re-election when using InterProcessSemaphoreMutex.

Here is the background: I am using Curator as a task coordinator for a list of concurrently running jobs. Every minute, each worker (each worker runs on a separate VM) tries to acquire a task (lock) from zk by calling:

    lock = new InterProcessSemaphoreMutex(zkClient, task);
    boolean hasLock = false;
    hasLock = lock.acquire(1, TimeUnit.SECONDS);

If the worker gets the lock, it does the task.

The issue is that the ZooKeeper quorum is somehow not stable, so leader re-election happens around twice a week. When leader re-election happens, all the VMs lose their connection to the zk quorum and they all need to reconnect to it. Sometimes I have observed one VM owning two locks at the same time.

So I am wondering what I should do in order to avoid the impact of leader re-election.

Thanks,
Jipeng
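
To make the advice above concrete, here is a minimal sketch of the "treat SUSPENDED as lock lost" pattern. It is not Curator code: `ConnState` is a local stand-in for Curator's `ConnectionState` enum so that the sketch compiles without the library, and `LockGuard` and its method names are hypothetical. In a real worker you would presumably put the same logic inside a `ConnectionStateListener` and register it with `client.getConnectionStateListenable().addListener(...)`.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Stand-in for org.apache.curator.framework.state.ConnectionState
// (assumption: only the states relevant to this sketch are listed).
enum ConnState { CONNECTED, SUSPENDED, RECONNECTED, LOST }

// Hypothetical helper: tracks whether the worker may still trust its lock.
class LockGuard {
    private final AtomicBoolean lockTrusted = new AtomicBoolean(false);
    private final Runnable cancelWork;

    // cancelWork is invoked once when a held lock can no longer be trusted.
    LockGuard(Runnable cancelWork) {
        this.cancelWork = cancelWork;
    }

    // Call after lock.acquire(...) returned true.
    void onAcquired() {
        lockTrusted.set(true);
    }

    // Call after the lock has been released normally.
    void onReleased() {
        lockTrusted.set(false);
    }

    // Mirrors the shape of ConnectionStateListener.stateChanged(client, newState).
    void stateChanged(ConnState newState) {
        if (newState == ConnState.SUSPENDED || newState == ConnState.LOST) {
            // Assume the lock is gone; cancel only if we actually held it.
            if (lockTrusted.getAndSet(false)) {
                cancelWork.run();
            }
        }
        // RECONNECTED deliberately does not restore trust: the worker
        // should go through acquire(...) again before doing more work.
    }

    // The task loop can poll this between steps and stop when it is false.
    boolean isLockTrusted() {
        return lockTrusted.get();
    }
}
```

A worker would call `onAcquired()` right after a successful `acquire`, check `isLockTrusted()` between units of work, and call `onReleased()` after releasing the lock. Whether the old lock object should also be released after a SUSPENDED event depends on your setup; the sketch only covers the local bookkeeping.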
