Pradeep created KAFKA-9829:
------------------------------

             Summary: Kafka brokers are un-registered on Zookeeper node replacement
                 Key: KAFKA-9829
                 URL: https://issues.apache.org/jira/browse/KAFKA-9829
             Project: Kafka
          Issue Type: Bug
    Affects Versions: 0.10.2.1
            Reporter: Pradeep

We have a Kafka cluster of 3 nodes connected to a Zookeeper (3.4.14) cluster of 3 nodes in AWS. We use an auto-scaling group to provision replacement nodes upon failures. We are seeing an issue where the Kafka brokers get un-registered when all the Zookeeper nodes are replaced one after the other. Each Zookeeper node is terminated from the AWS console, and we wait for a replacement node to be provisioned with Zookeeper initialized before terminating the next node.

After each Zookeeper node replacement, the /brokers/ids path shows all the Kafka broker ids in the cluster. Only after the final Zookeeper node replacement does the content of /brokers/ids become empty.

We see the logs below on one of the Zookeeper nodes once all of the original nodes have been replaced.

{{2020-03-26 20:29:20,303 [myid:3] - INFO [SessionTracker:ZooKeeperServer@355] - Expiring session 0x10003b973b50016, timeout of 6000ms exceeded}}
{{2020-03-26 20:29:20,303 [myid:3] - INFO [SessionTracker:ZooKeeperServer@355] - Expiring session 0x10003b973b5000e, timeout of 6000ms exceeded}}
{{2020-03-26 20:29:20,303 [myid:3] - INFO [SessionTracker:ZooKeeperServer@355] - Expiring session 0x30003a126690002, timeout of 6000ms exceeded}}
{{2020-03-26 20:29:20,307 [myid:3] - DEBUG [CommitProcessor:3:DataTree@893] - Deleting ephemeral node /brokers/ids/1002 for session 0x10003b973b50016}}
{{2020-03-26 20:29:20,307 [myid:3] - DEBUG [CommitProcessor:3:DataTree@893] - Deleting ephemeral node /brokers/ids/1003 for session 0x10003b973b5000e}}
{{2020-03-26 20:29:20,307 [myid:3] - DEBUG [CommitProcessor:3:DataTree@893] - Deleting ephemeral node /controller for session 0x30003a126690002}}
{{2020-03-26 20:29:20,307 [myid:3] - DEBUG [CommitProcessor:3:DataTree@893] - Deleting ephemeral node /brokers/ids/1001 for session 0x30003a126690002}}

I am not sure if the issue is related to KAFKA-5473.
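
To make the relationship in the logs above easier to see, here is a minimal Python sketch that pairs each expired session with the ephemeral znodes deleted for it. It only parses the message text of the seven log lines quoted in this report. The names `correlate`, `EXPIRE_RE`, and `DELETE_RE` are made up for illustration and are not part of Kafka or Zookeeper.

```python
import re
from collections import defaultdict

# Message parts of the Zookeeper log lines quoted in this report.
LOG = """\
2020-03-26 20:29:20,303 [myid:3] - INFO - Expiring session 0x10003b973b50016, timeout of 6000ms exceeded
2020-03-26 20:29:20,303 [myid:3] - INFO - Expiring session 0x10003b973b5000e, timeout of 6000ms exceeded
2020-03-26 20:29:20,303 [myid:3] - INFO - Expiring session 0x30003a126690002, timeout of 6000ms exceeded
2020-03-26 20:29:20,307 [myid:3] - DEBUG - Deleting ephemeral node /brokers/ids/1002 for session 0x10003b973b50016
2020-03-26 20:29:20,307 [myid:3] - DEBUG - Deleting ephemeral node /brokers/ids/1003 for session 0x10003b973b5000e
2020-03-26 20:29:20,307 [myid:3] - DEBUG - Deleting ephemeral node /controller for session 0x30003a126690002
2020-03-26 20:29:20,307 [myid:3] - DEBUG - Deleting ephemeral node /brokers/ids/1001 for session 0x30003a126690002
"""

# Hypothetical patterns, written against the message text seen above.
EXPIRE_RE = re.compile(r"Expiring session (0x[0-9a-f]+), timeout of (\d+)ms exceeded")
DELETE_RE = re.compile(r"Deleting ephemeral node (\S+) for session (0x[0-9a-f]+)")


def correlate(log_text):
    """Map each expired session id to (timeout_ms, [deleted znode paths])."""
    timeouts = {}
    deleted = defaultdict(list)
    for line in log_text.splitlines():
        m = EXPIRE_RE.search(line)
        if m:
            timeouts[m.group(1)] = int(m.group(2))
            continue
        m = DELETE_RE.search(line)
        if m:
            deleted[m.group(2)].append(m.group(1))
    return {sid: (timeouts[sid], deleted.get(sid, [])) for sid in timeouts}


result = correlate(LOG)
for sid, (timeout_ms, paths) in result.items():
    print(sid, timeout_ms, paths)
```

Run against the lines above, this shows that all three broker sessions expired with the same 6000ms timeout, and that the session holding /controller also held /brokers/ids/1001.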