Good point, Jay! I filed a JIRA to address this - https://issues.apache.org/jira/browse/KAFKA-150
Thanks,
Neha

On Fri, Oct 7, 2011 at 9:21 AM, Jay Kreps <jay.kr...@gmail.com> wrote:

> It occurs to me that we could do a better job with this error. There are
> really three things that might have happened: (1) you restarted kafka within
> the zk timeout, in which case as far as zk is concerned your old broker
> still exists...this is weird but actually correct behavior, (2) you have two
> brokers with the same id, (3) zk has a bug and is not deleting ephemeral
> nodes.
>
> I think if we just improved the error message to explain this we would have
> happier users; as is, it requires slightly deep knowledge of zk to understand
> why this happens.
>
> -Jay
>
> On Fri, Oct 7, 2011 at 7:35 AM, Mathias Herberts <mathias.herbe...@gmail.com> wrote:
>
>> If you abort Kafka (killing the JVM for example) and restart it,
>> depending on the zookeeper timeout you've used, it might occur that
>> the ephemeral node created by the broker has not yet been removed by
>> ZK.
>>
>> If this is the case, Kafka will detect that there is a znode conflict
>> and kill itself.
>>
>> This is what your logs seem to imply:
>>
>> [2011-10-03 15:33:22,229] INFO conflict in /brokers/ids/0 data:
>> 10.98.20.109-1317681202194:10.98.20.109:9092 stored data:
>> 10.98.20.109-1317268078266:10.98.20.109:9092 (kafka.utils.ZkUtils$)
>>
>> Try to either wait for more than the ZK timeout prior to restarting
>> Kafka, or lower the ZK timeout so the ephemeral node is indeed gone
>> when you restart Kafka.
>>
>> Mathias.
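
To make the suggestion above concrete, here is a rough sketch of what a more
descriptive conflict message could look like. It is plain Python for
illustration only, not Kafka's code: the function name
explain_broker_conflict, its parameters, and the 6000 ms timeout used in the
example call are all assumptions. Only the znode path and the two data strings
come from the log quoted above.

```python
def explain_broker_conflict(path: str, new_data: str, stored_data: str,
                            zk_session_timeout_ms: int) -> str:
    """Build a descriptive message for a broker znode conflict.

    Hypothetical helper: the name and parameters are illustrative and do
    not correspond to any function in Kafka.
    """
    # The three possible causes listed in the thread.
    causes = [
        "the broker was restarted within the ZooKeeper session timeout "
        f"({zk_session_timeout_ms} ms), so the old ephemeral node has not "
        "expired yet; wait for the timeout to pass, then restart",
        "another broker is already registered with the same broker id",
        "ZooKeeper did not delete the ephemeral node of a dead session",
    ]
    lines = [
        f"Conflict in {path}: a broker is already registered at this path.",
        f"  data being written:  {new_data}",
        f"  data already stored: {stored_data}",
        "Possible causes:",
    ]
    lines += [f"  ({i}) {cause}" for i, cause in enumerate(causes, start=1)]
    return "\n".join(lines)


if __name__ == "__main__":
    # Path and data strings are taken from the quoted log line; the
    # timeout value is an assumed example, not a known setting.
    print(explain_broker_conflict(
        "/brokers/ids/0",
        "10.98.20.109-1317681202194:10.98.20.109:9092",
        "10.98.20.109-1317268078266:10.98.20.109:9092",
        zk_session_timeout_ms=6000,
    ))
```

The point is only that the message names the likely causes directly, so a
user does not need to know how ZooKeeper ephemeral nodes expire to make sense
of it.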