Hi,

We're seeing some rather strange leader election behaviour in one of our clusters. The duration reported by the "FOLLOWING - LEADER ELECTION TOOK" log line (and the equivalent for the leader) varies hugely. During one rolling reboot, I saw values as small as 39 ms and as large as 57 seconds (the difference in units is not a typo). The average is about 10 seconds, and the standard deviation is also about 10 seconds. So the time taken is not only quite large, it's also very variable.
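For anyone wanting to reproduce figures like these from their own logs, here is a minimal sketch of one way to do it. It assumes the duration is the first integer following the "LEADER ELECTION TOOK" text and that it is in milliseconds. The helper names and the sample lines are made up for illustration and are not taken from our actual logs.

```python
import re
import statistics

# Assumption: the duration is the first integer after the marker text,
# and it is expressed in milliseconds.
ELECTION_RE = re.compile(r"LEADER ELECTION TOOK\D*(\d+)")


def election_times_ms(lines):
    """Return the election durations (ms) found in an iterable of log lines."""
    times = []
    for line in lines:
        match = ELECTION_RE.search(line)
        if match:
            times.append(int(match.group(1)))
    return times


def summarize(times):
    """Return (min, max, mean, population std dev) for a non-empty list."""
    return (
        min(times),
        max(times),
        statistics.mean(times),
        statistics.pstdev(times),
    )


# Made-up sample lines, for illustration only.
sample = [
    "INFO FOLLOWING - LEADER ELECTION TOOK - 39",
    "INFO LEADING - LEADER ELECTION TOOK - 57000",
    "INFO some unrelated line",
]

if __name__ == "__main__":
    lo, hi, mean, std = summarize(election_times_ms(sample))
    print(f"min={lo} ms max={hi} ms mean={mean:.0f} ms stddev={std:.0f} ms")
```

Running the same functions over the log files from every participant in a rolling restart gives the per-cluster spread, which makes it easier to compare the 5-node cluster against the 3-node ones.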
We have other clusters, but the average election time in those is in the hundreds of milliseconds, with a standard deviation in a similar ballpark. One difference is that the "slow" cluster has 5 participants while the others have 3. That may be a factor, but I wouldn't expect it to make two orders of magnitude of difference!

So my questions are: what factors contribute to the election time reported by these log lines, and what can we do to speed it up?

As far as I understand from the logs and a quick browse through the code, the reported time covers only selecting a leader; syncing with the leader happens after that. I can understand the syncing part varying with load, but I don't see why selecting the leader would.

Thanks,
Karol
