I have a basic Zookeeper/Kafka setup. I am still on Kafka 0.8 beta 1 and Zookeeper 3.4.5. The load on this machine is light: the Kafka queues receive a steady 1 message every 2-3 seconds, with occasional spikes, but nothing large enough to push any limits. Both Kafka and Zookeeper are running on the same machine.
Occasionally, a rebalance is triggered, which causes our Kafka clients to try reconnecting several times, but the rebalance ultimately fails with the following error:

04:56:10,020 INFO [kafka.consumer.ZookeeperConsumerConnector] (alarms.topology.updates_<host>-1383643783747-c7775701_watcher_executor) [alarms.topology.updates_<host>-1383643783747-c7775701], exception during rebalance : org.I0Itec.zkclient.exception.ZkNoNodeException: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/alarms.topology.updates/ids/alarms.topology.updates_<host>-1383643783747-c7775701
    at org.I0Itec.zkclient.exception.ZkException.create(ZkException.java:47) [zkclient-0.3.jar:0.3]
    at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:685) [zkclient-0.3.jar:0.3]
    at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:766) [zkclient-0.3.jar:0.3]
    at org.I0Itec.zkclient.ZkClient.readData(ZkClient.java:761) [zkclient-0.3.jar:0.3]
    at kafka.utils.ZkUtils$.readData(ZkUtils.scala:407) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at kafka.consumer.TopicCount$.constructTopicCount(TopicCount.scala:52) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.kafka$consumer$ZookeeperConsumerConnector$ZKRebalancerListener$$rebalance(ZookeeperConsumerConnector.scala:401) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anonfun$syncedRebalance$1.apply$mcVI$sp(ZookeeperConsumerConnector.scala:374) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at scala.collection.immutable.Range.foreach$mVc$sp(Range.scala:78) [scala-library-2.9.2.jar:]
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener.syncedRebalance(ZookeeperConsumerConnector.scala:369) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
    at kafka.consumer.ZookeeperConsumerConnector$ZKRebalancerListener$$anon$1.run(ZookeeperConsumerConnector.scala:326) [kafka_2.9.2-0.8.0-SNAPSHOT.jar:0.8.0-SNAPSHOT]
Caused by: org.apache.zookeeper.KeeperException$NoNodeException: KeeperErrorCode = NoNode for /consumers/alarms.topology.updates/ids/alarms.topology.updates_<host>-1383643783747-c7775701
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:111) [zookeeper-3.4.3.jar:3.4.3-1240972]
    at org.apache.zookeeper.KeeperException.create(KeeperException.java:51) [zookeeper-3.4.3.jar:3.4.3-1240972]
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1131) [zookeeper-3.4.3.jar:3.4.3-1240972]
    at org.apache.zookeeper.ZooKeeper.getData(ZooKeeper.java:1160) [zookeeper-3.4.3.jar:3.4.3-1240972]
    at org.I0Itec.zkclient.ZkConnection.readData(ZkConnection.java:103) [zkclient-0.3.jar:0.3]
    at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:770) [zkclient-0.3.jar:0.3]
    at org.I0Itec.zkclient.ZkClient$9.call(ZkClient.java:766) [zkclient-0.3.jar:0.3]
    at org.I0Itec.zkclient.ZkClient.retryUntilConnected(ZkClient.java:675) [zkclient-0.3.jar:0.3]
    ... 9 more

Our Kafka consumers are written in Clojure (https://github.com/pingles/clj-kafka).

Any ideas on what can cause such behaviour? The rebalances themselves happen sporadically, but when they do, they sometimes fail and an error like the one above is shown. I'm not sure if this is a Kafka or Zookeeper problem at this point, but any help would be appreciated.

Thanks
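
For context, the retries mentioned above are governed by a few consumer settings. Below is a minimal sketch, assuming the Kafka 0.8 property names (`zookeeper.session.timeout.ms`, `rebalance.max.retries`, `rebalance.backoff.ms`); the class `ConsumerSettings`, its methods, the `localhost:2181` address, and all values are hypothetical and not taken from this setup. It checks one commonly suggested rule of thumb: the total retry window should outlast the Zookeeper session timeout, so that a stale ephemeral consumer node has time to expire before the retries run out.

```java
import java.util.Properties;

// Hypothetical helper; class and method names are illustrative only.
final class ConsumerSettings {

    // Builds consumer properties with explicit rebalance and session settings.
    // Property names are assumed to match the Kafka 0.8 high-level consumer.
    static Properties build(String zkConnect, String groupId,
                            int maxRetries, int backoffMs, int sessionTimeoutMs) {
        Properties props = new Properties();
        props.setProperty("zookeeper.connect", zkConnect);
        props.setProperty("group.id", groupId);
        props.setProperty("zookeeper.session.timeout.ms", Integer.toString(sessionTimeoutMs));
        props.setProperty("rebalance.max.retries", Integer.toString(maxRetries));
        props.setProperty("rebalance.backoff.ms", Integer.toString(backoffMs));
        return props;
    }

    // True when (retries * backoff) is longer than one Zookeeper session timeout.
    static boolean retryWindowCoversSession(Properties props) {
        long retries = Long.parseLong(props.getProperty("rebalance.max.retries"));
        long backoff = Long.parseLong(props.getProperty("rebalance.backoff.ms"));
        long session = Long.parseLong(props.getProperty("zookeeper.session.timeout.ms"));
        return retries * backoff > session;
    }

    public static void main(String[] args) {
        // Illustrative values: 8 retries, 2 s backoff, 6 s session timeout.
        Properties props = build("localhost:2181", "alarms.topology.updates", 8, 2000, 6000);
        System.out.println(retryWindowCoversSession(props));
    }
}
```

With clj-kafka the same keys would presumably be passed as a config map rather than a `Properties` object, but that mapping is an assumption as well.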