I'm not sure what you're suggesting, David. Could you be more specific, please?
-Flavio

> On 30 Aug 2016, at 23:54, David Brower <[email protected]> wrote:
>
> Anything you could do with iptables you can do in the process by having it
> drop connections from things not on a whitelist, and not having a thread
> waiting indefinitely for operations from any connection.
>
> -dB
>
>
> On 8/30/2016 2:46 PM, Flavio Junqueira wrote:
>> I was trying to write down an analysis and I haven't been able to come up
>> with anything that is foolproof. Basically, the two main issues are:
>>
>> - A bad server is able to connect to a good server when it has a message
>> outstanding and is trying to establish a connection to the good server.
>> This happens if the server is LOOKING or has an outstanding message from
>> the previous round. The converse isn't true, though: a good server can't
>> start a connection to a bad server because the bad server doesn't have a
>> listener.
>> - If we bounce servers sequentially, there is a chance that a bad server
>> is elected more than once during the process, which induces multiple
>> leader election rounds.
>>
>> Perhaps this is overkill, but I was wondering if it makes sense to filter
>> election traffic to and from bad servers using, for example, iptables. The
>> idea is to add a rule, local to each server, that prevents the server from
>> getting connections established for leader election. For each bad server,
>> we stop it, remove the rule, and bring it back up. We also stop a minority
>> first before stopping the bad leader.
>>
>> -Flavio
>>
>>> On 29 Aug 2016, at 09:29, Guy Laden <[email protected]> wrote:
>>>
>>> Hi Flavio, thanks for your reply. The situation is that indeed all the
>>> servers are in a bad state, so it looks like we will have to perform a
>>> cluster restart.
>>>
>>> We played with attempts to optimize the downtime along the lines you
>>> suggested. In testing it we ran into the issue where a server with no
>>> Listener thread can initiate a leader election connection to a
>>> newly-restarted server that does have a Listener. The result is a quorum
>>> that may include 'bad' servers, even a 'bad' leader. So we tried to first
>>> restart the higher-id servers, because lower-id servers will drop their
>>> leader-election connections to higher-id servers.
>>> I'm told there are issues with this flow as well but have not yet
>>> investigated the details.
>>> I also worry about the leader-election retries done with exponential
>>> backoff.
>>>
>>> I guess we will play with things a bit more, but at this point I am
>>> tending towards a simple parallel restart of all servers.
>>>
>>> Once the clusters are healthy again we will do a rolling upgrade to 3.4.8
>>> sometime soon.
>>>
>>> Thanks again,
>>> Guy
>>>
>>>
>>> On Sun, Aug 28, 2016 at 5:52 PM, Flavio Junqueira <[email protected]> wrote:
>>>
>>>> Hi Guy,
>>>>
>>>> We don't have a way to restart the listener thread, so you really need
>>>> to bounce the server. I don't think there is a way of doing this without
>>>> forcing a leader election, assuming all your servers are in this bad
>>>> state. To minimize downtime, one thing you can do is to avoid bouncing
>>>> the current leader until it loses quorum support. Once it loses quorum
>>>> support, you have a quorum of healthy servers and they will elect a new,
>>>> healthy leader. At that point, you can bounce all your unhealthy servers.
>>>>
>>>> You may also want to move to a later 3.4 release.
>>>>
>>>> -Flavio
>>>>
>>>>> On 24 Aug 2016, at 23:15, Guy Laden <[email protected]> wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> It looks like, due to a security scan sending "bad" traffic to the
>>>>> leader election port, we have clusters in which the leader election
>>>>> Listener thread is dead (an unchecked exception was thrown and the
>>>>> thread died - seen in the log).
>>>>> (This seems to be fixed by
>>>>> https://issues.apache.org/jira/browse/ZOOKEEPER-2186)
>>>>>
>>>>> In this state, when a healthy server comes up and tries to connect to
>>>>> the quorum, it gets stuck on the leader election. It establishes TCP
>>>>> connections to the other servers, but any traffic it sends seems to get
>>>>> stuck in the receiver's TCP Recv queue (seen with netstat), and is not
>>>>> read/processed by zk.
>>>>>
>>>>> Not a good place to be :)
>>>>>
>>>>> This is with 3.4.6.
>>>>>
>>>>> Is there a way to get such clusters back to a healthy state without
>>>>> loss of quorum / client impact? Some way of restarting the listener
>>>>> thread? Or restarting the servers in a certain order?
>>>>> e.g. If I restart a minority, say the ones with lower server ids - is
>>>>> there a way to get the majority servers to re-initiate leader election
>>>>> connections with them so as to connect them to the quorum? (And to do
>>>>> this without the majority losing quorum.)
>>>>>
>>>>> Thanks,
>>>>> Guy
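
To make the restart ordering Flavio suggests above concrete (bounce the
unhealthy followers first, and the current leader only after it has lost
quorum support), here is a minimal sketch of how the current leader might be
identified before each step. It is not from the thread: the hostnames are
placeholders, the standard client port 2181 is assumed, and it relies on the
'mntr' four-letter command, which a 3.4.x server answers with its state.

    # Sketch only: ask each ZooKeeper server for its state so the current
    # leader can be identified and bounced last. Hostnames and client port
    # are placeholders; adjust to the actual ensemble.
    import socket

    SERVERS = ["zk1.example.com", "zk2.example.com", "zk3.example.com"]  # hypothetical
    CLIENT_PORT = 2181

    def server_state(host, port=CLIENT_PORT, timeout=5.0):
        """Send the 'mntr' four-letter command and return zk_server_state."""
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"mntr")
            data = b""
            while True:
                chunk = sock.recv(4096)
                if not chunk:
                    break
                data += chunk
        for line in data.decode().splitlines():
            if line.startswith("zk_server_state"):
                return line.split("\t", 1)[1].strip()  # 'leader' or 'follower'
        return "unknown"

    if __name__ == "__main__":
        for host in SERVERS:
            try:
                print(host, server_state(host))
            except OSError as err:
                print(host, "unreachable:", err)

Running this before and during the rolling restart shows when a majority of
healthy servers is back and one of them reports 'leader'; at that point the
remaining unhealthy servers, including the old leader, can be bounced.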

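Flavio's iptables idea above could look roughly like the following. Again a
sketch only, not something from the thread: it assumes the default election
port 3888 (from server.N=host:2888:3888) and root privileges, and simply adds
and removes local DROP rules for election traffic to and from this host.

    # Sketch only: block/unblock leader election traffic on this host.
    # 3888 is assumed to be the election port; must run as root.
    import subprocess

    ELECTION_PORT = "3888"

    def _election_rules(action):
        # INPUT blocks peers connecting to this server's election listener;
        # OUTPUT blocks this server initiating election connections to peers.
        for chain in ("INPUT", "OUTPUT"):
            subprocess.run(
                ["iptables", action, chain, "-p", "tcp",
                 "--dport", ELECTION_PORT, "-j", "DROP"],
                check=True,
            )

    def block_election_traffic():
        _election_rules("-A")   # append the DROP rules

    def unblock_election_traffic():
        _election_rules("-D")   # delete the same rules

Following Flavio's procedure, the rules would stay in place on each bad
server until it is that server's turn: stop the server, remove the rules,
and bring it back up, handling a minority first and the bad leader last.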