Kezhu Wang created ZOOKEEPER-4870:
-------------------------------------

             Summary: Proactive leadership transfer
                 Key: ZOOKEEPER-4870
                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-4870
             Project: ZooKeeper
          Issue Type: New Feature
          Components: java client, server
            Reporter: Kezhu Wang
We do have leadership transfer, but it only happens when we are removing the leader in a reconfiguration. It would be nice to support it with a dedicated API. This would be really useful for reducing unavailability during a rolling upgrade or leader shutdown.

Also, I think it could help with zxid rollover. Inheriting leadership in rollover should be similar to leadership transfer at the protocol level.

https://www.usenix.org/conference/atc12/technical-sessions/presentation/shraer
{quote}
we investigate the effect of reconfigurations removing the leader. Note that a server can never be added to a cluster as leader as we always prioritize the current leader. Figure 8 shows the advantage of designating a new leader when removing the current one, and thus avoiding leader election. It depicts the average time to recover from a leader crash versus the average time to regain system availability following the removal of the leader. The average is taken on 10 executions. We can see that designating a default leader saves up to 1sec, depending on the cluster size. As cluster size increases, leader election takes longer while using a default leader takes constant time regardless of the cluster size. Nevertheless, as the figure shows, cluster size always affects total leader recovery time, as it includes synchronizing state with a quorum of followers.
{quote}

--
This message was sent by Atlassian Jira
(v8.20.10#820010)