[
https://issues.apache.org/jira/browse/KAFKA-1460?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14093200#comment-14093200
]
Ryan Williams commented on KAFKA-1460:
--------------------------------------
I'm looking into this as well. Kafka was running fine on Friday, but when I
came back on Monday the producers were unable to send messages. I have also
posted to the users list and will see whether that surfaces anything to look at.
===============
Producer error
===============
[2014-08-11 19:32:49,781] WARN Error while fetching metadata [{TopicMetadata
for topic mytopic ->
No partition metadata for topic mytopic due to
kafka.common.LeaderNotAvailableException}] for topic [mytopic]: class
kafka.common.LeaderNotAvailableException (kafka.producer.BrokerPartitionInfo)
[2014-08-11 19:32:49,782] ERROR Failed to collate messages by topic, partition
due to: Failed to fetch topic metadata for topic: mytopic
(kafka.producer.async.DefaultEventHandler)
===============
state-change.log
===============
[2014-08-11 19:12:45,312] TRACE Controller 0 epoch 3 started leader election
for partition [mytopic,0] (state.change.logger)
[2014-08-11 19:12:45,321] ERROR Controller 0 epoch 3 initiated state change for
partition [mytopic,0] from OfflinePartition to OnlinePartition failed
(state.change.logger)
kafka.common.NoReplicaOnlineException: No replica for partition [mytopic,0] is
alive. Live brokers are: [Set()], Assigned replicas are: [List(0)]
at
kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:61)
[2014-08-11 19:12:45,312] TRACE Controller 0 epoch 3 started leader election
for partition [mytopic,1] (state.change.logger)
[2014-08-11 19:12:45,321] ERROR Controller 0 epoch 3 initiated state change for
partition [mytopic,1] from OfflinePartition to OnlinePartition failed
(state.change.logger)
kafka.common.NoReplicaOnlineException: No replica for partition [mytopic,1] is
alive. Live brokers are: [Set()], Assigned replicas are: [List(0)]
at
kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:61)
===============
controller.log
===============
[2014-08-11 19:12:45,308] DEBUG [OfflinePartitionLeaderSelector]: No broker in
ISR is alive for [mytopic,1]. Pick the leader from the alive assigned replicas:
(kafka.controller.OfflinePartitionLeaderSelector)
[2014-08-11 19:12:45,321] DEBUG [OfflinePartitionLeaderSelector]: No broker in
ISR is alive for [mytopic,0]. Pick the leader from the alive assigned replicas:
(kafka.controller.OfflinePartitionLeaderSelector)
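
For anyone triaging the same symptom, here is a minimal sketch (Python 3,
standard library only) that pulls the partition, live-broker set, and assigned
replicas out of NoReplicaOnlineException messages like the ones in
state-change.log above. The names PATTERN, parse_ids, and
find_offline_partitions are hypothetical helpers, not part of Kafka, and the
pattern assumes the message format shown in these excerpts, without mail
quoting prefixes.

```python
import re

# Matches the NoReplicaOnlineException message as it appears in the excerpts.
# \s+ is used between words because the mail client wrapped the log lines.
PATTERN = re.compile(
    r"No\s+replica\s+for\s+partition\s+"
    r"\[(?P<topic>[^,\]]+),(?P<partition>\d+)\]\s+is\s+alive\."
    r"\s+Live\s+brokers\s+are:\s+\[Set\((?P<live>[^)]*)\)\],"
    r"\s+Assigned\s+replicas\s+are:\s+\[List\((?P<assigned>[^)]*)\)\]"
)


def parse_ids(text):
    """Turn a comma-separated id list such as '0, 1' into [0, 1]."""
    return [int(x) for x in text.split(",") if x.strip()]


def find_offline_partitions(log_text):
    """Return one dict per NoReplicaOnlineException message found."""
    results = []
    for m in PATTERN.finditer(log_text):
        results.append({
            "topic": m.group("topic"),
            "partition": int(m.group("partition")),
            "live_brokers": parse_ids(m.group("live")),
            "assigned_replicas": parse_ids(m.group("assigned")),
        })
    return results


# Sample text copied from the state-change.log excerpt above.
SAMPLE = """kafka.common.NoReplicaOnlineException: No replica for partition [mytopic,0] is
alive. Live brokers are: [Set()], Assigned replicas are: [List(0)]
kafka.common.NoReplicaOnlineException: No replica for partition [mytopic,1] is
alive. Live brokers are: [Set()], Assigned replicas are: [List(0)]
"""

if __name__ == "__main__":
    for entry in find_offline_partitions(SAMPLE):
        print(entry)
```

An empty live_brokers list next to a non-empty assigned_replicas list is the
pattern in both reports here: the controller sees no live brokers at all, even
though broker 0 is the assigned replica.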
> NoReplicaOnlineException: No replica for partition
> --------------------------------------------------
>
> Key: KAFKA-1460
> URL: https://issues.apache.org/jira/browse/KAFKA-1460
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 0.8.1.1
> Reporter: Artur Denysenko
> Priority: Critical
> Attachments: state-change.log
>
>
> We have a standalone kafka server.
> After several days of running we get:
> {noformat}
> kafka.common.NoReplicaOnlineException: No replica for partition
> [gk.q.module,1] is alive. Live brokers are: [Set()], Assigned replicas are:
> [List(0)]
> at
> kafka.controller.OfflinePartitionLeaderSelector.selectLeader(PartitionLeaderSelector.scala:61)
> at
> kafka.controller.PartitionStateMachine.electLeaderForPartition(PartitionStateMachine.scala:336)
> at
> kafka.controller.PartitionStateMachine.kafka$controller$PartitionStateMachine$$handleStateChange(PartitionStateMachine.scala:185)
> at
> kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:99)
> at
> kafka.controller.PartitionStateMachine$$anonfun$triggerOnlinePartitionStateChange$3.apply(PartitionStateMachine.scala:96)
> at
> scala.collection.TraversableLike$WithFilter$$anonfun$foreach$1.apply(TraversableLike.scala:743)
> at
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:95)
> at
> scala.collection.mutable.HashMap$$anonfun$foreach$1.apply(HashMap.scala:95)
> at scala.collection.Iterator$class.foreach(Iterator.scala:772)
> at
> scala.collection.mutable.HashTable$$anon$1.foreach(HashTable.scala:157)
> at
> scala.collection.mutable.HashTable$class.foreachEntry(HashTable.scala:190)
> at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:45)
> at scala.collection.mutable.HashMap.foreach(HashMap.scala:95)
> at
> scala.collection.TraversableLike$WithFilter.foreach(TraversableLike.scala:742)
> at
> kafka.controller.PartitionStateMachine.triggerOnlinePartitionStateChange(PartitionStateMachine.scala:96)
> at
> kafka.controller.PartitionStateMachine.startup(PartitionStateMachine.scala:68)
> at
> kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:312)
> at
> kafka.controller.KafkaController$$anonfun$1.apply$mcV$sp(KafkaController.scala:162)
> at
> kafka.server.ZookeeperLeaderElector.elect(ZookeeperLeaderElector.scala:63)
> at
> kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSession$1.apply$mcZ$sp(KafkaController.scala:1068)
> at
> kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSession$1.apply(KafkaController.scala:1066)
> at
> kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSession$1.apply(KafkaController.scala:1066)
> at kafka.utils.Utils$.inLock(Utils.scala:538)
> at
> kafka.controller.KafkaController$SessionExpirationListener.handleNewSession(KafkaController.scala:1066)
> at org.I0Itec.zkclient.ZkClient$4.run(ZkClient.java:472)
> at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
> {noformat}
> Please see attached [state-change.log]
> You can find all server logs (450mb) here:
> http://46.4.114.35:9999/deploy/kafka-logs.2014-05-14-16.tgz
> On client we get:
> {noformat}
> 16:28:36,843 [ool-12-thread-2] WARN ZookeeperConsumerConnector -
> [dev_dev-1400257716132-e7b8240c], no brokers found when trying to rebalance.
> {noformat}
> If we try to send message using 'kafka-console-producer.sh':
> {noformat}
> [root@dev kafka]# /srv/kafka/bin/kafka-console-producer.sh --broker-list
> localhost:9092 --topic test
> message
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
> details.
> [2014-05-16 19:45:30,950] WARN Fetching topic metadata with correlation id 0
> for topics [Set(test)] from broker [id:0,host:localhost,port:9092] failed
> (kafka.client.ClientUtils$)
> java.net.SocketTimeoutException
> at
> sun.nio.ch.SocketAdaptor$SocketInputStream.read(SocketAdaptor.java:229)
> at sun.nio.ch.ChannelInputStream.read(ChannelInputStream.java:103)
> at
> java.nio.channels.Channels$ReadableByteChannelImpl.read(Channels.java:385)
> at kafka.utils.Utils$.read(Utils.scala:375)
> at
> kafka.network.BoundedByteBufferReceive.readFrom(BoundedByteBufferReceive.scala:54)
> at kafka.network.Receive$class.readCompletely(Transmission.scala:56)
> at
> kafka.network.BoundedByteBufferReceive.readCompletely(BoundedByteBufferReceive.scala:29)
> at kafka.network.BlockingChannel.receive(BlockingChannel.scala:100)
> at kafka.producer.SyncProducer.liftedTree1$1(SyncProducer.scala:74)
> at
> kafka.producer.SyncProducer.kafka$producer$SyncProducer$$doSend(SyncProducer.scala:71)
> at kafka.producer.SyncProducer.send(SyncProducer.scala:112)
> at kafka.client.ClientUtils$.fetchTopicMetadata(ClientUtils.scala:53)
> at
> kafka.producer.BrokerPartitionInfo.updateInfo(BrokerPartitionInfo.scala:82)
> at
> kafka.producer.async.DefaultEventHandler$$anonfun$handle$1.apply$mcV$sp(DefaultEventHandler.scala:67)
> at kafka.utils.Utils$.swallow(Utils.scala:167)
> at kafka.utils.Logging$class.swallowError(Logging.scala:106)
> at kafka.utils.Utils$.swallowError(Utils.scala:46)
> at
> kafka.producer.async.DefaultEventHandler.handle(DefaultEventHandler.scala:67)
> at
> kafka.producer.async.ProducerSendThread.tryToHandle(ProducerSendThread.scala:104)
> at
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:87)
> at
> kafka.producer.async.ProducerSendThread$$anonfun$processEvents$3.apply(ProducerSendThread.scala:67)
> at scala.collection.immutable.Stream.foreach(Stream.scala:526)
> at
> kafka.producer.async.ProducerSendThread.processEvents(ProducerSendThread.scala:66)
> at
> kafka.producer.async.ProducerSendThread.run(ProducerSendThread.scala:44)
> {noformat}
> If we try to receive message using 'kafka-console-consumer.sh':
> {noformat}
> [root@dev kafka]# /srv/kafka/bin/kafka-console-consumer.sh --zookeeper
> localhost:2181 --topic test
> SLF4J: Failed to load class "org.slf4j.impl.StaticLoggerBinder".
> SLF4J: Defaulting to no-operation (NOP) logger implementation
> SLF4J: See http://www.slf4j.org/codes.html#StaticLoggerBinder for further
> details.
> [2014-05-16 19:46:23,029] WARN
> [console-consumer-69449_dev-1400262382648-1c9bfcd3], no brokers found when
> trying to rebalance. (kafka.consumer.ZookeeperConsumerConnector)
> {noformat}
> Port 9092 is open:
> {noformat}
> [root@dev kafka]# telnet localhost 9092
> Trying 127.0.0.1...
> Connected to localhost.
> {noformat}