José Armando García Sancio created KAFKA-19343:
--------------------------------------------------
Summary: Misconfigured broker listeners cause unrecoverable controller error
Key: KAFKA-19343
URL: https://issues.apache.org/jira/browse/KAFKA-19343
Project: Kafka
Issue Type: Bug
Components: controller
Affects Versions: 3.9.0
Reporter: José Armando García Sancio
With a misconfigured broker it is possible for the controller to throw the
exception below.
{code:java}
[2025-05-07 08:47:50,245] ERROR Haven't been able to send leader and isr
requests, current state of the map is HashMap(1 -> LeaderAndIsrBatch(version=6,
brokerId=1, brokerEpoch=111669150272, controllerId=6, controllerEpoch=201,
containsAllReplicas=false, numPartitions=334, numTopicIds=159,
numLiveLeaders=2), 3 -> LeaderAndIsrBatch(version=6, brokerId=3, brokerEpoch=0,
controllerId=0, controllerEpoch=0, containsAllReplicas=false,
numPartitions=302, numTopicIds=178, numLiveLeaders=0), 4 ->
LeaderAndIsrBatch(version=6, brokerId=4, brokerEpoch=0, controllerId=0,
controllerEpoch=0, containsAllReplicas=false, numPartitions=228,
numTopicIds=163, numLiveLeaders=0), 5 -> LeaderAndIsrBatch(version=6,
brokerId=5, brokerEpoch=0, controllerId=0, controllerEpoch=0,
containsAllReplicas=false, numPartitions=33, numTopicIds=19,
numLiveLeaders=0), 6 -> LeaderAndIsrBatch(version=6, brokerId=6, brokerEpoch=0,
controllerId=0, controllerEpoch=0, containsAllReplicas=false,
numPartitions=261, numTopicIds=149, numLiveLeaders=0)). Exception message:
kafka.common.BrokerEndPointNotAvailableException: End point with listener name
INTERNAL_SCRAM not found for broker 1
(kafka.controller.ControllerBrokerRequestBatch)
[2025-05-07 08:47:50,245] ERROR Haven't been able to send metadata update
requests, current state of the map is HashMap(1 ->
UpdateMetadataBatch(version=7, brokerId=1, brokerEpoch=0, controllerId=0,
controllerEpoch=0, hasNewBrokers=false, numPartitions=549, numTopicIds=203,
numLiveBrokers=0), 3 -> UpdateMetadataBatch(version=7, brokerId=3,
brokerEpoch=0, controllerId=0, controllerEpoch=0, hasNewBrokers=false,
numPartitions=549, numTopicIds=203, numLiveBrokers=0), 4 ->
UpdateMetadataBatch(version=7, brokerId=4, brokerEpoch=0, controllerId=0,
controllerEpoch=0, hasNewBrokers=false, numPartitions=549, numTopicIds=203,
numLiveBrokers=0), 5 -> UpdateMetadataBatch(version=7, brokerId=5,
brokerEpoch=0, controllerId=0, controllerEpoch=0, hasNewBrokers=false,
numPartitions=549, numTopicIds=203, numLiveBrokers=0), 6 ->
UpdateMetadataBatch(version=7, brokerId=6, brokerEpoch=0, controllerId=0,
controllerEpoch=0, hasNewBrokers=false, numPartitions=549, numTopicIds=203,
numLiveBrokers=0)). Exception message:
kafka.common.BrokerEndPointNotAvailableException: End point with listener name
INTERNAL_SCRAM not found for broker 1
(kafka.controller.ControllerBrokerRequestBatch)
[2025-05-07 08:47:50,246] ERROR Haven't been able to send stop replica
requests, current state of the map is HashMap(2 -> StopReplicaBatch(version=3,
brokerId=2, brokerEpoch=0, controllerId=0, controllerEpoch=0,
numPartitions=549, numTopicIds=203)). Exception message:
kafka.common.BrokerEndPointNotAvailableException: End point with listener name
INTERNAL_SCRAM not found for broker 1
(kafka.controller.ControllerBrokerRequestBatch)
[2025-05-07 08:47:50,247] ERROR [ReplicaStateMachine controllerId=6] Error
while moving some replicas to OfflineReplica state
(kafka.controller.ZkReplicaStateMachine)
java.lang.IllegalStateException:
kafka.common.BrokerEndPointNotAvailableException: End point with listener name
INTERNAL_SCRAM not found for broker 1
at
kafka.controller.AbstractControllerBrokerRequestBatch.sendRequestsToBrokers(ControllerChannelManager.scala:708)
at
kafka.controller.ZkReplicaStateMachine.handleStateChanges(ReplicaStateMachine.scala:120)
at
kafka.controller.KafkaController.onReplicasBecomeOffline(KafkaController.scala:770)
at kafka.controller.KafkaController.onBrokerFailure(KafkaController.scala:734)
at
kafka.controller.KafkaController.processBrokerChange(KafkaController.scala:1906)
at kafka.controller.KafkaController.process(KafkaController.scala:3426)
at kafka.controller.QueuedEvent.process(ControllerEventManager.scala:52)
at
kafka.controller.ControllerEventManager$ControllerEventThread.process$1(ControllerEventManager.scala:130)
at
kafka.controller.ControllerEventManager$ControllerEventThread.$anonfun$doWork$1(ControllerEventManager.scala:133)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.scala:18)
at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
at
kafka.controller.ControllerEventManager$ControllerEventThread.doWork(ControllerEventManager.scala:133)
at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:99)
Caused by: kafka.common.BrokerEndPointNotAvailableException: End point with
listener name INTERNAL_SCRAM not found for broker 1
at kafka.cluster.Broker.$anonfun$node$1(Broker.scala:96)
at scala.Option.getOrElse(Option.scala:201)
at kafka.cluster.Broker.node(Broker.scala:95)
at
kafka.controller.AbstractControllerBrokerRequestBatch.$anonfun$sendLeaderAndIsrRequest$3(ControllerChannelManager.scala:599)
at scala.collection.mutable.HashSet$Node.foreach(HashSet.scala:435)
at scala.collection.mutable.HashSet.foreach(HashSet.scala:361)
at
kafka.controller.AbstractControllerBrokerRequestBatch.$anonfun$sendLeaderAndIsrRequest$1(ControllerChannelManager.scala:597)
at
kafka.controller.AbstractControllerBrokerRequestBatch.$anonfun$sendLeaderAndIsrRequest$1$adapted(ControllerChannelManager.scala:588)
at
kafka.utils.Implicits$MapExtensionMethods$.$anonfun$forKeyValue$1(Implicits.scala:62)
at scala.collection.mutable.HashMap$Node.foreachEntry(HashMap.scala:633)
at scala.collection.mutable.HashMap.foreachEntry(HashMap.scala:499)
at
kafka.controller.AbstractControllerBrokerRequestBatch.sendLeaderAndIsrRequest(ControllerChannelManager.scala:588)
at
kafka.controller.AbstractControllerBrokerRequestBatch.sendRequestsToBrokers(ControllerChannelManager.scala:691)
... 12 more
{code}
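For illustration, a listener mismatch along the following lines could produce the
BrokerEndPointNotAvailableException above: the stack trace shows the controller
resolving each broker's endpoint via kafka.cluster.Broker.node, which fails when
the broker's registration does not include the listener name the controller
expects. The host names, ports, and security protocol mappings below are
hypothetical, not taken from the affected cluster.
{code}
# --- server.properties on the controller and correctly configured brokers ---
# Inter-broker (and controller-to-broker) traffic is expected on INTERNAL_SCRAM.
inter.broker.listener.name=INTERNAL_SCRAM
listener.security.protocol.map=INTERNAL_SCRAM:SASL_PLAINTEXT,EXTERNAL:PLAINTEXT
listeners=INTERNAL_SCRAM://broker0.example.com:9093,EXTERNAL://broker0.example.com:9092

# --- server.properties on the misconfigured broker 1 ---
# The INTERNAL_SCRAM listener is missing, so broker 1 registers in ZooKeeper
# without an INTERNAL_SCRAM endpoint. Broker 1 itself can still start (its own
# inter-broker listener falls back to the PLAINTEXT-mapped EXTERNAL listener),
# but the controller's Broker.node(INTERNAL_SCRAM) lookup for it fails.
listener.security.protocol.map=EXTERNAL:PLAINTEXT
listeners=EXTERNAL://broker1.example.com:9092
{code}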
It looks like the controller doesn’t handle this error correctly: even after the
broker configuration is fixed, the controller cannot recover from the error.
The workaround is to force a controller failover.
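Since the trace comes from the ZooKeeper-based controller
(kafka.controller.ZkReplicaStateMachine), one way to force that failover is to
delete the ephemeral /controller znode so another broker wins the election. A
minimal sketch, with a placeholder ZooKeeper connect string:
{code}
# Trigger a controller re-election by removing the ephemeral /controller znode.
# zk1.example.com:2181 is a placeholder for the cluster's ZooKeeper connect string.
bin/zookeeper-shell.sh zk1.example.com:2181 delete /controller
{code}
Restarting the broker that currently holds the controller role would also force a
new election.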