Hi Harish,

I think I see what may be the problem. Based on your initial description (6 ZK nodes, 3 down), you no longer have a quorum. A ZooKeeper cluster can only apply updates (e.g. removing znodes) while it has a quorum, which is a strict majority (more than half) of the configured ZooKeeper nodes. If I understand correctly, you have 6 ZooKeeper nodes configured and 3 are down. A 6-node cluster needs at least 4 nodes up, so with exactly half of the cluster running, ZooKeeper does not have a quorum and no updates can be made. I don't know much about the new TTL feature in 3.5, but my assumption is that it works on the same principle: no updates can be made to the cluster's znodes when there is no quorum. The same applies to a 3-node ZooKeeper cluster: you must have 2 nodes running to form a quorum and allow any updates to occur.
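To make the arithmetic concrete, here is a small sketch of the majority rule. It is only an illustration, not ZooKeeper code: the function names are made up, and it assumes the default majority quorum (no weighted or hierarchical quorums, no observers).

```python
def quorum_size(ensemble_size: int) -> int:
    """Smallest number of servers that is a strict majority of the ensemble."""
    return ensemble_size // 2 + 1


def has_quorum(ensemble_size: int, servers_up: int) -> bool:
    """True if enough servers are running to form a majority quorum."""
    return servers_up >= quorum_size(ensemble_size)


if __name__ == "__main__":
    # (configured servers, servers currently up)
    for total, up in [(6, 3), (6, 4), (3, 2), (3, 1)]:
        print(
            f"{total} configured, {up} up: "
            f"quorum needs {quorum_size(total)}, "
            f"has quorum = {has_quorum(total, up)}"
        )
```

Under this rule a 6-node ensemble needs 4 servers and so tolerates only 2 failures, the same as a 5-node ensemble, which is why odd ensemble sizes are usually recommended.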
Please correct me if I missed something....

Thanks,
Brian

On Tue, Jun 12, 2018 at 1:33 PM, harish lohar <hklo...@gmail.com> wrote:

> ---------- Forwarded message ---------
> From: harish lohar <hklo...@gmail.com>
> Date: Tue, Jun 12, 2018 at 3:26 PM
> Subject: Re: Kafka Failing to start due to existing ID
> To: <an...@apache.org>
>
> Hi Andor,
>
> Thanks for your reply.
>
> This issue is irrespective of number of nodes, even should be seen with 3
> Node cluster as well.
>
> Actually kafka has session_timeout config , but that seems to be in effect
> only if zookeeper cluster is up i.e. if kafka goes down when zookeeper
> cluster is up.
>
> Now let's say if 2 nodes of Zookeeper cluster is down , and then if kafka
> connected to 3rd Zookeeper Node goes down zookeeper cluster doesn't refresh
> the session for Kafka connected to 3rd Node.
>
> So when other Node comes up and zookeeper cluster becomes available it
> doesn't delete the id of the kafka which went down when zookeeper cluster
> was down.
>
> Regarding TTL I have already enquired the kafka forum and awaiting reply.
>
> Ideally once zookeper cluster is up , it should delete the kafka broker
> id's which are not connected which doesn't seem to be happening
>
> I hope I am making some sense :)
>
> Thanks
> harish
>
> On Tue, Jun 12, 2018 at 2:59 PM Andor Molnár <an...@apache.org> wrote:
>
> > Hi Harish,
> >
> > I have a few questions to get some insight about your issue.
> >
> > 1. Why do run ZooKeeper with 6 nodes while odd number of nodes are
> > recommended (not an issue really, just for curiousity),
> >
> > 2. Does Kafka support ZK 3.5+ with TTL nodes?
> >
> > I think this is more of a Kafka question, but afaik Kafka doesn't run and
> > cannot take advantage of 3.5 only features of ZK. Maybe I'm wrong, but I
> > think it has some cleanup mechanism to delete expired broker ids or you
> > must wait for the session to expire.
> >
> > Regards,
> >
> > Andor
> >
> > On 06/12/2018 04:39 PM, harish lohar wrote:
> >
> > Hi All,
> >
> > Need help regarding below scenario if any configuration is available to
> > help.
> >
> > I have cluster of 6 nodes
> > 3 Nodes are stopped and brought up again, kafka fails to restart since
> > broker ID are still present in zookeeper znode /broker/ids/
> >
> > Since the cluster goes down after removing 3 Nodes , session timeout
> > doesn't happen.
> >
> > Though i am aware about TTL feature in zookeeper , but how to make sure
> > kafka creates znodes with TTL
> >
> > Thanks
> > Harish

--
Brian Lininger
Technical Architect, Infrastructure & Search
Veeva Systems
brian.linin...@veeva.com
www.veeva.com