Hi All,
I would like to ask about the best practice for handling leader
re-election when using InterProcessSemaphoreMutex.
Here is the background:
I am using Curator as a task coordinator for a list of concurrently
running jobs.
Every minute, every worker (each worker runs on a separate VM) tries to
acquire a task (lock) from zk by calling:
// "task" is the zk path that identifies the task
lock = new InterProcessSemaphoreMutex(zkClient, task);
// wait up to 1 second for the lock
boolean hasLock = lock.acquire(1, TimeUnit.SECONDS);
If the worker gets the lock, it does the task.
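
To make the flow above concrete, here is a minimal sketch of one polling
round. It is only an illustration: TaskLock and TaskRunner are hypothetical
names, TaskLock stands in for the acquire/release pair that
InterProcessSemaphoreMutex exposes so the sketch compiles without Curator,
and releasing the lock once the task finishes is an assumption that the
snippet above does not show.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the two lock methods used here; it mirrors
// acquire(long, TimeUnit) and release() on InterProcessSemaphoreMutex.
interface TaskLock {
    boolean acquire(long time, TimeUnit unit) throws Exception;
    void release() throws Exception;
}

// Hypothetical helper: one polling round for one task.
final class TaskRunner {
    // Waits up to one second for the lock. Runs the task only if the lock
    // was obtained, and releases the lock afterwards even if the task throws.
    static boolean runIfAcquired(TaskLock lock, Runnable task) throws Exception {
        if (!lock.acquire(1, TimeUnit.SECONDS)) {
            return false; // another worker currently owns this task
        }
        try {
            task.run();
            return true;
        } finally {
            lock.release(); // assumption: the lock is given back after the task
        }
    }
}
```

In the real worker, the TaskLock argument would be the
InterProcessSemaphoreMutex instance created for the task path.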
The issue is that, for some reason, the ZooKeeper quorum is not stable, so
a leader re-election happens about twice a week. When a leader re-election
happens, all the VMs lose their connection to the zk quorum and have to
reconnect. Sometimes I have observed one VM owning two locks at the same
time.
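
For reference, the Curator lock documentation recommends registering a
ConnectionStateListener and watching for SUSPENDED and LOST, because a
worker cannot be sure it still holds a lock after those states. The sketch
below models only that bookkeeping in plain Java. ConnState is a local
mirror of the Curator ConnectionState values, LockOwnership is a
hypothetical helper, and refusing to trust the lock again until it is
re-acquired is a deliberately conservative assumption. It is not a
confirmed explanation of the double ownership described above.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Local mirror of the Curator ConnectionState values used in this sketch.
enum ConnState { CONNECTED, SUSPENDED, RECONNECTED, LOST }

// Hypothetical helper that tracks whether a worker may still rely on its lock.
final class LockOwnership {
    private final AtomicBoolean trusted = new AtomicBoolean(false);

    // Call right after acquire(...) returned true.
    void onAcquired() { trusted.set(true); }

    // Call right after release().
    void onReleased() { trusted.set(false); }

    // Intended to be driven from a ConnectionStateListener callback.
    // Assumption: after SUSPENDED or LOST the lock is not trusted again
    // until it has been re-acquired, even if RECONNECTED arrives later.
    void onStateChange(ConnState state) {
        if (state == ConnState.SUSPENDED || state == ConnState.LOST) {
            trusted.set(false);
        }
    }

    // The task should check this before each step that needs exclusivity.
    boolean isTrusted() { return trusted.get(); }
}
```

A worker using this would stop working on the task as soon as isTrusted()
returns false, then try to acquire the lock again on the next round.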
So I am wondering what I should do to avoid the impact of leader
re-election.
Thanks,
Jipeng