Re: Broker deregisters from ZK, but stays alive and does not rejoin the cluster

2019-03-23 Thread Joe Ammann
I've filed https://issues.apache.org/jira/browse/KAFKA-8151, and tried to keep the descriptions of the different symptoms apart. I have yet to collect detailed information about an occurrence of symptom 2, and I will try the 2.2 RC later today. CU, Joe On 3/23/19 1:17 AM, Ismael Juma wrote: >

Re: Broker deregisters from ZK, but stays alive and does not rejoin the cluster

2019-03-23 Thread Peter Levart
Hi Joe, I think I observed a lockup similar to the one you describe in the 3rd variant. The controller broker was partially stuck, but other brokers still regarded it as the controller. Unfortunately the broker was restarted by an impatient admin before I had a chance to investigate. The symptoms were as

Re: Broker deregisters from ZK, but stays alive and does not rejoin the cluster

2019-03-22 Thread Ismael Juma
I'd suggest filing a single JIRA as a first step. Please test the 2.2 RC before filing if possible. Please include enough details for someone else to reproduce. Thanks! Ismael On Fri, Mar 22, 2019, 3:14 PM Joe Ammann wrote: > Hi Ismael > > I've done a few more tests, and it seems that I'm

Re: Broker deregisters from ZK, but stays alive and does not rejoin the cluster

2019-03-22 Thread Joe Ammann
Hi Ismael, I've done a few more tests, and it seems that I'm able to "reproduce" various kinds of problems in Kafka 2.1.1 in our DEV. I can force these by faking an outage of Zookeeper. What I do for my tests is freeze (kill -STOP) 2 out of 3 ZK instances, let the Kafka brokers continue, then

Re: Broker deregisters from ZK, but stays alive and does not rejoin the cluster

2019-03-21 Thread Ismael Juma
Hi Joe, This is not expected behaviour, please file a JIRA. Ismael On Mon, Mar 18, 2019 at 7:29 AM Joe Ammann wrote: > Hi all > > We're running several clusters (mostly with 3 brokers) with 2.1.1 > > We quite regularly see the pattern that one of the 3 brokers "detaches" > from ZK (the broker

Re: Broker deregisters from ZK, but stays alive and does not rejoin the cluster

2019-03-21 Thread Joe Ammann
Hi all, I investigated a bit deeper and came to the conclusion that it's probably expected behaviour that a broker keeps running after losing the ZK session and does not necessarily restart or reconnect automatically.

Broker deregisters from ZK, but stays alive and does not rejoin the cluster

2019-03-18 Thread Joe Ammann
Hi all We're running several clusters (mostly with 3 brokers) with 2.1.1 We quite regularly see the pattern that one of the 3 brokers "detaches" from ZK (the broker id is not registered anymore under /brokers/ids). We assume that the root cause for this is that the brokers are running on VMs
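
The "detached" state described above (a broker id no longer registered under /brokers/ids) could be detected with a small check like the following. This is only a minimal sketch, not something taken from the thread: the function name `missing_broker_ids` and the sample ids are hypothetical, and fetching the child znodes is left to whatever ZooKeeper client is in use (for example kazoo's `KazooClient.get_children("/brokers/ids")`).

```python
def missing_broker_ids(expected_ids, registered_children):
    """Return the expected broker ids that have no znode under /brokers/ids.

    registered_children is the list of child names a ZooKeeper client
    returns for /brokers/ids (strings such as "1", "2", "3").
    """
    registered = {int(child) for child in registered_children}
    return sorted(set(expected_ids) - registered)


if __name__ == "__main__":
    # Hypothetical sample: a 3-broker cluster where broker 2 has dropped out.
    children = ["1", "3"]
    print(missing_broker_ids([1, 2, 3], children))
```

Running such a check periodically against each cluster would at least flag a broker that is still alive as a process but has not re-registered.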