Hi Flavio,

Thanks for your reply. Indeed, all the servers are in a bad state, so it looks like we will have to perform a cluster restart.

We experimented with ways to minimize the downtime along the lines you suggested. In testing, we ran into an issue where a server with no Listener thread can initiate a leader-election connection to a newly restarted server that does have a Listener. The result is a quorum that may include 'bad' servers, even a 'bad' leader.

So we tried restarting the higher-id servers first, because lower-id servers will drop their leader-election connections to higher-id servers. I'm told there are issues with this flow as well, but I have not yet investigated the details. I also worry about the leader-election retries done with exponential backoff.

I guess we will play with things a bit more, but at this point I am tending towards a simple parallel restart of all servers. Once the clusters are healthy again we will do a rolling upgrade to 3.4.8 sometime soon.

Thanks again,
Guy

On Sun, Aug 28, 2016 at 5:52 PM, Flavio Junqueira <[email protected]> wrote:

> Hi Guy,
>
> We don't have a way to restart the listener thread, so you really need to
> bounce the server. I don't think there is a way of doing this without
> forcing a leader election, assuming all your servers are in this bad state.
> To minimize downtime, one thing you can do is to avoid bouncing the current
> leader until it loses quorum support. Once it loses quorum support, you
> have a quorum of healthy servers and they will elect a new, healthy leader.
> At that point, you can bounce all your unhealthy servers.
>
> You may also want to move to a later 3.4 release.
>
> -Flavio
>
> > On 24 Aug 2016, at 23:15, Guy Laden <[email protected]> wrote:
> >
> > Hi all,
> >
> > It looks like, due to a security scan sending "bad" traffic to the
> > leader election port, we have clusters in which the leader election
> > Listener thread is dead (an unchecked exception was thrown and the
> > thread died - seen in the log).
> > (This seems to be fixed in
> > https://issues.apache.org/jira/browse/ZOOKEEPER-2186)
> >
> > In this state, when a healthy server comes up and tries to connect to
> > the quorum, it gets stuck on the leader election. It establishes TCP
> > connections to the other servers, but any traffic it sends seems to get
> > stuck in the receiver's TCP Recv queue (seen with netstat), and is not
> > read/processed by zk.
> >
> > Not a good place to be :)
> >
> > This is with 3.4.6
> >
> > Is there a way to get such clusters back to a healthy state without
> > loss of quorum / client impact?
> > Some way of restarting the listener thread? Or restarting the servers
> > in a certain order?
> > e.g. if I restart a minority, say the ones with lower server ids - is
> > there a way to get the majority servers to re-initiate leader election
> > connections with them so as to connect them to the quorum? (And to do
> > this without the majority losing quorum.)
> >
> > Thanks,
> > Guy
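
The restart-order reasoning in the reply at the top of this thread (restart the higher-id servers first, because lower-id servers drop their leader-election connections to higher-id servers) can be sketched as a toy model. This is not ZooKeeper code. It assumes that the election connection which survives between two servers is always the one opened by the higher id towards the lower id, so it only works if the lower id's Listener is alive. The names `surviving_link` and `reachable_peers` and the 5-node ensemble are hypothetical.

```python
def surviving_link(a, b, listener_alive):
    """Return True if servers a and b can hold an election connection.

    Assumed rule: the lower id drops connections it opens to a higher id,
    so the surviving connection is opened by the higher id and must be
    accepted by the lower id's Listener thread.
    """
    lower = min(a, b)
    return listener_alive[lower]


def reachable_peers(server, listener_alive):
    """List the peers this server can exchange election traffic with."""
    return sorted(
        peer
        for peer in listener_alive
        if peer != server and surviving_link(server, peer, listener_alive)
    )


# Hypothetical ensemble: True means the server was restarted and its
# Listener is alive, False means it is still in the 'bad' state.
higher_first = {1: False, 2: False, 3: True, 4: True, 5: True}
lower_first = {1: True, 2: True, 3: True, 4: False, 5: False}

# Higher ids restarted first: restarted servers only see each other.
print(reachable_peers(5, higher_first))  # [3, 4]
print(reachable_peers(1, higher_first))  # []

# Lower ids restarted first: 'bad' server 5 can still join them.
print(reachable_peers(5, lower_first))   # [1, 2, 3]
```

Under these assumptions, restarting the three highest ids yields a quorum made up only of restarted servers, while restarting the three lowest ids lets a server with a dead Listener connect in. The model ignores timing, retries and backoff, which may be where the remaining issues with this flow come from.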
