Re: Exception causing Kafka to crash

Jay Kreps Fri, 07 Oct 2011 09:21:49 -0700

It occurs to me that we could do a better job with this error. There are
really three things that might have happened: (1) you restarted kafka within
the zk session timeout, in which case as far as zk is concerned your old
broker still exists (this is weird but actually correct behavior), (2) you
have two brokers with the same id, or (3) zk has a bug and is not deleting
ephemeral nodes.


I think if we just improved the error message to explain this we would have
happier users; as it is, it requires fairly deep knowledge of zk to
understand why this happens.
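
As a rough sketch of what a more explanatory message could look like (Python
purely for illustration; the function name and its parameters are made up
here and are not part of Kafka), something along these lines would spell out
the three cases:

```python
def explain_broker_conflict(path, new_data, stored_data, session_timeout_ms):
    """Build a multi-line explanation for a broker registration conflict.

    Hypothetical helper: the name, arguments, and wording are assumptions
    made for illustration, not Kafka's actual error handling.
    """
    lines = [
        f"Conflict registering broker at {path}.",
        f"  data we tried to write: {new_data}",
        f"  data already stored:    {stored_data}",
        "Possible causes:",
        # Case 1: the old ephemeral node is still owned by a live session.
        f"  (1) The broker was restarted within the zk session timeout "
        f"({session_timeout_ms} ms), so the old ephemeral node has not "
        f"expired yet. Wait at least that long before restarting.",
        # Case 2: a configuration mistake on the Kafka side.
        "  (2) Two brokers are configured with the same broker id.",
        # Case 3: a problem on the zk side.
        "  (3) zk did not delete the ephemeral node of a dead session.",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(explain_broker_conflict(
        "/brokers/ids/0",
        "10.98.20.109-1317681202194:10.98.20.109:9092",
        "10.98.20.109-1317268078266:10.98.20.109:9092",
        6000,  # example timeout value, chosen arbitrarily
    ))
```

The point is only that the message names the path, shows both values, and
lists the likely causes, so the user does not need to know how zk sessions
work to act on it.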

-Jay

On Fri, Oct 7, 2011 at 7:35 AM, Mathias Herberts <mathias.herbe...@gmail.com> wrote:

> If you abort Kafka (killing the JVM for example) and restart it,
> depending on the zookeeper timeout you've used, it might occur that
> the ephemeral node created by the broker has not yet been removed by
> ZK.
>
> If this is the case, Kafka will detect that there is a znode conflict
> and kill itself.
>
> This is what your logs seem to imply:
>
> [2011-10-03 15:33:22,229] INFO conflict in /brokers/ids/0 data:
> 10.98.20.109-1317681202194:10.98.20.109:9092 stored data:
> 10.98.20.109-1317268078266:10.98.20.109:9092 (kafka.utils.ZkUtils$)
>
> Try to either wait for more than the ZK timeout prior to restarting
> Kafka, or lower the ZK timeout so the ephemeral node is indeed gone
> when you restart Kafka.
>
> Mathias.
>
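
For anyone who wants to pick apart the conflict line quoted above, here is a
small sketch. It assumes the registration string has the shape
host-timestamp:host:port and that the number is a creation time in
milliseconds since the epoch; that is inferred from the log line, not from
any documented format, and the helper names are invented for this example.

```python
from typing import NamedTuple


class Registration(NamedTuple):
    creator_host: str      # host part before the "-" (assumed meaning)
    created_ms: int        # assumed: creation time, ms since the epoch
    advertised_host: str   # host clients would connect to
    port: int


def parse_registration(raw: str) -> Registration:
    """Split a string like '10.98.20.109-1317681202194:10.98.20.109:9092'."""
    creator, advertised_host, port = raw.split(":")
    # rsplit so that a "-" inside a hostname does not break the parse.
    creator_host, created = creator.rsplit("-", 1)
    return Registration(creator_host, int(created), advertised_host, int(port))


def registration_gap_ms(new_raw: str, stored_raw: str) -> int:
    """How much older the stored registration is than the new attempt."""
    new = parse_registration(new_raw)
    stored = parse_registration(stored_raw)
    return new.created_ms - stored.created_ms


if __name__ == "__main__":
    gap = registration_gap_ms(
        "10.98.20.109-1317681202194:10.98.20.109:9092",
        "10.98.20.109-1317268078266:10.98.20.109:9092",
    )
    print(f"stored registration is {gap / 1000.0:.0f} s older than the new one")
```

Note that the gap only says how old the stored registration is. By itself it
cannot tell you whether the old session is still inside its timeout, whether
a second broker shares the id, or whether zk failed to clean up.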
