Johnny Brown created KAFKA-2300:
-----------------------------------
Summary: Error in controller log when broker tries to rejoin cluster
Key: KAFKA-2300
URL: https://issues.apache.org/jira/browse/KAFKA-2300
Project: Kafka
Issue Type: Bug
Affects Versions: 0.8.2.1
Reporter: Johnny Brown
Hello Kafka folks,
We are having an issue where a broker attempts to join the cluster after being
restarted, but is never added to the ISR for its assigned partitions. This is a
three-node cluster, and the controller is broker 2.
When broker 1 starts, we see the following message in broker 2's controller.log.
{{
[2015-06-23 13:57:16,535] ERROR [BrokerChangeListener on Controller 2]: Error while handling broker changes (kafka.controller.ReplicaStateMachine$BrokerChangeListener)
java.lang.IllegalStateException: Controller to broker state change requests batch is not empty while creating a new one. Some UpdateMetadata state changes Map(2 -> Map([prod-sver-end,1] -> (LeaderAndIsrInfo:(Leader:-2,ISR:1,LeaderEpoch:0,ControllerEpoch:165),ReplicationFactor:1),AllReplicas:1)), 1 -> Map([prod-sver-end,1] -> (LeaderAndIsrInfo:(Leader:-2,ISR:1,LeaderEpoch:0,ControllerEpoch:165),ReplicationFactor:1),AllReplicas:1)), 3 -> Map([prod-sver-end,1] -> (LeaderAndIsrInfo:(Leader:-2,ISR:1,LeaderEpoch:0,ControllerEpoch:165),ReplicationFactor:1),AllReplicas:1))) might be lost
    at kafka.controller.ControllerBrokerRequestBatch.newBatch(ControllerChannelManager.scala:202)
    at kafka.controller.KafkaController.sendUpdateMetadataRequest(KafkaController.scala:974)
    at kafka.controller.KafkaController.onBrokerStartup(KafkaController.scala:399)
    at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply$mcV$sp(ReplicaStateMachine.scala:371)
    at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359)
    at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1$$anonfun$apply$mcV$sp$1.apply(ReplicaStateMachine.scala:359)
    at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:33)
    at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply$mcV$sp(ReplicaStateMachine.scala:358)
    at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357)
    at kafka.controller.ReplicaStateMachine$BrokerChangeListener$$anonfun$handleChildChange$1.apply(ReplicaStateMachine.scala:357)
    at kafka.utils.Utils$.inLock(Utils.scala:535)
    at kafka.controller.ReplicaStateMachine$BrokerChangeListener.handleChildChange(ReplicaStateMachine.scala:356)
    at org.I0Itec.zkclient.ZkClient$7.run(ZkClient.java:568)
    at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
}}
{{prod-sver-end}} is a topic we previously deleted. It seems some remnant of it
persists in the controller's memory, causing an exception which interrupts the
state change triggered by the broker startup.
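To make the suspected failure mode concrete, here is a minimal Python sketch. It is a toy model, not Kafka's code. The names {{RequestBatch}}, {{on_broker_startup}}, {{demo}} and the topic {{live-topic}} are hypothetical stand-ins, loosely modeled on the {{newBatch}} and {{onBrokerStartup}} frames in the stack trace above. How the stale {{prod-sver-end}} entries ended up in the batch is an assumption on our part. The sketch only shows why a leftover entry would abort the startup handling before any later step runs.

```python
class IllegalStateError(RuntimeError):
    """Stand-in for java.lang.IllegalStateException."""


class RequestBatch:
    """Toy model of a controller-to-broker request batch (hypothetical)."""

    def __init__(self):
        # broker id -> {(topic, partition): state description}
        self.update_metadata = {}

    def new_batch(self):
        # Refuse to start a new batch while old entries are still queued.
        if self.update_metadata:
            raise IllegalStateError(
                "batch is not empty while creating a new one; "
                f"pending UpdateMetadata changes: {self.update_metadata}"
            )

    def add_update_metadata(self, broker_ids, topic_partition, state):
        for broker_id in broker_ids:
            self.update_metadata.setdefault(broker_id, {})[topic_partition] = state

    def send(self):
        # Hand the queued entries off and leave the batch empty.
        sent = self.update_metadata
        self.update_metadata = {}
        return sent


def on_broker_startup(batch, broker_id, events):
    """Toy startup handler: everything after new_batch() is skipped if it raises."""
    batch.new_batch()
    events.append(f"broker {broker_id} online")
    batch.add_update_metadata([broker_id], ("live-topic", 0), "Leader:1")
    batch.send()


def demo():
    batch = RequestBatch()
    events = []
    # Assumption: entries for the deleted topic were queued but never sent.
    batch.add_update_metadata([1, 2, 3], ("prod-sver-end", 1), "Leader:-2")
    try:
        on_broker_startup(batch, 1, events)
        failed = False
    except IllegalStateError:
        failed = True
    return failed, events
```

In this model, {{demo()}} reports a failure and an empty event list: the handler never reaches the step that would bring broker 1 online. That matches what we observe, though we have not confirmed that this is what actually happens inside the controller.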
Has anyone seen something like this? Any idea what's happening here? Any
information would be greatly appreciated.
Thanks,
Johnny