When using ZooKeeper for leader election or distributed locking, my assumption is that as soon as the lock owner sees its session transition to the 'CONNECTING' state, it should commit suicide (stop acting as owner) to avoid the risk of multiple owners. Please correct this assumption if it is wrong, or tell me if there is a better way to guarantee a single lock owner/leader.
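
To make the assumption concrete, here is a minimal sketch of the rule I have in mind. It is plain Python with hypothetical names (`SessionState`, `LockOwner`, `on_state_change`); none of this is the real ZooKeeper client API, it only models the decision the lock owner makes on a state change.

```python
from enum import Enum


class SessionState(Enum):
    # Hypothetical names mirroring the session states discussed above;
    # these are illustrative only, not the real client library's types.
    CONNECTED = "CONNECTED"
    CONNECTING = "CONNECTING"
    EXPIRED = "EXPIRED"


class LockOwner:
    """Toy lock owner that drops ownership whenever the session is not CONNECTED."""

    def __init__(self):
        self.is_owner = False

    def acquire(self, state):
        # Only claim ownership while the session is known to be connected.
        if state is SessionState.CONNECTED:
            self.is_owner = True
        return self.is_owner

    def on_state_change(self, state):
        # Conservative rule: in any state other than CONNECTED we cannot prove
        # that the ensemble still considers our session alive, so step down.
        # Reconnecting does not restore ownership; acquire() must be called again.
        if state is not SessionState.CONNECTED:
            self.is_owner = False
        return self.is_owner


owner = LockOwner()
owner.acquire(SessionState.CONNECTED)
owner.on_state_change(SessionState.CONNECTING)  # ownership is dropped here
```

With this rule, safety never depends on the ensemble, but every disconnect costs the application its leader, which is the availability problem described next.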
If the above assumption is correct, I am trying to figure out how to improve the availability of the application (the leader/lock owner) when the ZooKeeper ensemble is broken (e.g. undergoing ZooKeeper leader election for a prolonged period of time). Options I have considered:

1/ Use multiple ensembles for leader election/locking to avoid a SPOF (complex to implement).
2/ Extend the ZooKeeper protocol to give the client more information on connection loss, such as "ZooKeeper leader election in progress", so that the client can decide when it is OK not to commit suicide while still guaranteeing a single application leader/lock owner. (I haven't been able to prove that this will guarantee a single application leader/lock owner.)

If this has already been answered or solved, please point me to the post/doc.
