I have a single-broker test Kafka instance that was running fine on Friday (basically an out-of-the-box configuration with 2 partitions). Now I come back on Monday and producers are unable to send messages.
What else can I look at to debug this, and how can I prevent it? I know how to recover by removing the data directories for Kafka and ZooKeeper to start fresh. But this isn't the first time this has happened, so I would like to understand it better and feel more comfortable with Kafka.

===================
Producer error (from console producer)
===================
[2014-08-11 19:32:49,781] WARN Error while fetching metadata [{TopicMetadata for topic mytopic -> No partition metadata for topic mytopic due to kafka.common.LeaderNotAvailableException}] for topic [mytopic]: class kafka.common.LeaderNotAvailableException (kafka.producer.BrokerPartitionInfo)
[2014-08-11 19:32:49,782] ERROR Failed to collate messages by topic, partition due to: Failed to fetch topic metadata for topic: mytopic (kafka.producer.async.DefaultEventHandler)

===============
state-change.log
===============
[2014-08-11 19:12:45,312] TRACE Controller 0 epoch 3 started leader election for partition [mytopic,0] (state.change.logger)
[2014-08-11 19:12:45,321] ERROR Controller 0 epoch 3 initiated state change for partition [mytopic,0] from OfflinePartition to OnlinePartition failed (state.change.logger)
kafka.common.NoReplicaOnlineException: No replica for partition [mytopic,0] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)]
    at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:61)
[2014-08-11 19:12:45,312] TRACE Controller 0 epoch 3 started leader election for partition [mytopic,1] (state.change.logger)
[2014-08-11 19:12:45,321] ERROR Controller 0 epoch 3 initiated state change for partition [mytopic,1] from OfflinePartition to OnlinePartition failed (state.change.logger)
kafka.common.NoReplicaOnlineException: No replica for partition [mytopic,1] is alive. Live brokers are: [Set()], Assigned replicas are: [List(0)]
    at kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:61)

===============
controller.log
===============
[2014-08-11 19:12:45,308] DEBUG [OfflinePartitionLeaderSelector]: No broker in ISR is alive for [mytopic,1]. Pick the leader from the alive assigned replicas: (kafka.controller.OfflinePartitionLeaderSelector)
[2014-08-11 19:12:45,321] DEBUG [OfflinePartitionLeaderSelector]: No broker in ISR is alive for [mytopic,0]. Pick the leader from the alive assigned replicas: (kafka.controller.OfflinePartitionLeaderSelector)
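
To make the state-change.log entries easier to compare across incidents, here is a rough Python sketch that pulls the live brokers and assigned replicas out of each NoReplicaOnlineException message. It assumes the message format is exactly what appears in the log above; the function name `offline_partitions` and the multi-broker form `Set(1, 2)` are my own assumptions, not anything Kafka provides.

```python
import re

# Assumed format, copied from the NoReplicaOnlineException lines in state-change.log.
PATTERN = re.compile(
    r"No replica for partition \[(?P<topic>[^,\]]+),(?P<partition>\d+)\] is alive\.\s*"
    r"Live brokers are: \[Set\((?P<live>[^)]*)\)\],\s*"
    r"Assigned replicas are: \[List\((?P<assigned>[^)]*)\)\]"
)


def _ids(text):
    """Turn a comma-separated list such as '0, 1' into [0, 1]; '' becomes []."""
    return [int(x) for x in text.split(",") if x.strip()]


def offline_partitions(log_text):
    """Hypothetical helper: summarize each failed leader election found in log_text."""
    results = []
    for m in PATTERN.finditer(log_text):
        live = set(_ids(m.group("live")))
        assigned = _ids(m.group("assigned"))
        results.append({
            "topic": m.group("topic"),
            "partition": int(m.group("partition")),
            "live_brokers": sorted(live),
            "assigned_replicas": assigned,
            # Replicas the controller expected but did not see as alive.
            "missing_replicas": [b for b in assigned if b not in live],
        })
    return results


if __name__ == "__main__":
    sample = (
        "kafka.common.NoReplicaOnlineException: No replica for partition "
        "[mytopic,0] is alive. Live brokers are: [Set()], "
        "Assigned replicas are: [List(0)]"
    )
    for entry in offline_partitions(sample):
        print(entry)
```

For the log above, both partitions report an empty live-broker set while broker 0 is the only assigned replica, so the controller apparently saw no live brokers at all at 19:12:45, not even broker 0.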