Well, these are prod clusters, so my ability to experiment is rather limited. I can take a copy of the snapshot and try both a 3 node and a 5 node setup in a test cluster.

One thing I forgot to mention is that in most clusters the number of election notification log lines I see is roughly the same as the number of participants. In this cluster, however, there are typically 2 or 3 times as many notifications as participants.

My gut feeling is that it's more likely to be due to load, as the 5 node cluster is much busier and the election time has been increasing over time (as has load). I have no idea exactly what kind of load though, whether it's number of clients, frequency of transactions, total data size, etc. I don't understand why load would matter here, but that may just be my limited knowledge of the election protocol.

Karol

> On 28 Apr 2015, at 19:54, Camille Fournier <[email protected]> wrote:
>
> Just out of curiosity, if you start the 5 node cluster up with only 3 of
> the nodes to begin with (like, config 5, but only bring up 3 processes),
> does it speed up the leader election or is it still slow?
>
> C
>
> On Tue, Apr 28, 2015 at 1:41 PM, Karol Dudzinski <[email protected]>
> wrote:
>
>> Hi,
>>
>> We're seeing some rather strange leader election in one of our clusters.
>> The duration reported by the "FOLLOWING - LEADER ELECTION TOOK" log line
>> (and equivalent for the leader) seems to vary hugely. During one rolling
>> reboot, I saw the number reported as small as 39ms and as large as 57
>> seconds (difference in units is not a typo). The average is just about 10
>> seconds and std dev also about 10 seconds. So the time taken is not only
>> quite large, it's also very variable.
>>
>> We have other clusters but the average election time in those is in the
>> hundreds of millis with std dev in a similar ballpark. I guess one
>> difference is the "slow" cluster is 5 participants while the others are 3,
>> which may be a factor but I wouldn't expect it to make two orders of
>> magnitude difference!
>>
>> So my question is, what factors contribute to the election time reported
>> by these log lines? And what can we do to speed this up?
>>
>> As far as I understand from logs and a quick browse through the code that
>> time is the time to select a leader. Syncing up to the leader happens
>> after that. The syncing part I can understand will vary depending on load
>> but I don't see why selecting the leader would.
>>
>> Thanks,
>> Karol
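
For anyone wanting to compare clusters the same way, here is a rough sketch of how the election times could be pulled out of the server logs and summarised. It assumes the duration is the first integer following "LEADER ELECTION TOOK" on the line and that it is in milliseconds; the exact suffix of the log line may differ between ZooKeeper versions, so treat the regex as an assumption to check against your own logs. The function names are made up for illustration.

```python
import re
import statistics

# Assumption: the log line looks like
#   "... FOLLOWING - LEADER ELECTION TOOK - <millis>"
# (or LEADING for the leader). Adjust the pattern if your version differs.
ELECTION_RE = re.compile(r"(FOLLOWING|LEADING) - LEADER ELECTION TOOK\D*(\d+)")


def election_times_ms(lines):
    """Return the election durations (assumed ms) found in an iterable of log lines."""
    times = []
    for line in lines:
        match = ELECTION_RE.search(line)
        if match:
            times.append(int(match.group(2)))
    return times


def summarize(times):
    """Return count, min, max, mean and sample std dev, or None if there is no data."""
    if not times:
        return None
    return {
        "count": len(times),
        "min": min(times),
        "max": max(times),
        "mean": statistics.mean(times),
        # Sample std dev needs at least two points; report 0.0 otherwise.
        "stdev": statistics.stdev(times) if len(times) > 1 else 0.0,
    }
```

Feeding it the lines of each server's log (for example `summarize(election_times_ms(open(path)))`, where `path` is wherever your ZooKeeper log lives) gives one summary per server, which makes it easier to see whether the slow elections are spread across all participants or concentrated on a few.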
