[ https://issues.apache.org/jira/browse/KAFKA-19318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
dennis lucero updated KAFKA-19318:
----------------------------------
Description:
Despite several attempts to migrate from a ZooKeeper cluster to KRaft, the migration never completes properly. We spawned a new, fully healthy cluster of 18 Kafka broker nodes connected to 5 ZooKeeper nodes, plus 9 additional Kafka nodes that act as the new KRaft controllers. This was tested with Kafka 3.9.0.

The controllers are started with:
{code:java}
==> /var/log/kafka/kafka-server.log <== [2025-05-20 15:04:28,798] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$) [2025-05-20 15:04:29,207] INFO [ControllerServer id=111] Starting controller (kafka.server.ControllerServer) [2025-05-20 15:04:29,656] INFO Updated connection-accept-rate max connection creation rate to 2147483647 (kafka.network.ConnectionQuotas) [2025-05-20 15:04:29,784] INFO [SocketServer listenerType=CONTROLLER, nodeId=111] Created data-plane acceptor and processors for endpoint : ListenerName(CONTROLLER) (kafka.network.SocketServer) [2025-05-20 15:04:29,792] INFO [SharedServer id=111] Starting SharedServer (kafka.server.SharedServer) [2025-05-20 15:04:29,841] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Recovering unflushed segment 0. 0/1 recovered for __cluster_metadata-0. (kafka.log.LogLoader) [2025-05-20 15:04:29,842] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Loading producer state till offset 0 with message format version 2 (kafka.log.UnifiedLog$) [2025-05-20 15:04:29,843] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Reloading from producer snapshot and rebuilding producer state from offset 0 (kafka.log.UnifiedLog$) [2025-05-20 15:04:29,843] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Producer state recovery took 0ms for snapshot load and 0ms for segment recovery from offset 0 (kafka.log.UnifiedLog$) [2025-05-20 15:04:29,905] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Loading producer state till offset 2709 with message format version 2 (kafka.log.UnifiedLog$) [2025-05-20 15:04:29,905] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Reloading from producer snapshot and rebuilding producer state from offset 2709 (kafka.log.UnifiedLog$) [2025-05-20 15:04:29,906] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Producer state recovery took 1ms for snapshot load and 0ms for segment recovery from offset 2709 (kafka.log.UnifiedLog$) [2025-05-20 15:04:29,940] INFO Initialized snapshots with IDs SortedSet(OffsetAndEpoch(offset=0, epoch=0)) from /data/kafka1/kafka/__cluster_metadata-0 (kafka.raft.KafkaMetadataLog$) [2025-05-20 15:04:29,952] INFO [raft-expiration-reaper]: Starting (kafka.raft.TimingWheelExpirationService$ExpiredOperationReaper) [2025-05-20 15:04:30,191] INFO [ControllerServer id=111] Waiting for controller quorum voters future (kafka.server.ControllerServer) [2025-05-20 15:04:30,191] INFO [ControllerServer id=111] Finished waiting for controller quorum voters future (kafka.server.ControllerServer) [2025-05-20 15:04:30,233] INFO [ZooKeeperClient KRaft Migration] Initializing a new session to prod-zk-11.ae-rus.net:2181,prod-zk-12.ae-rus.net:2181,prod-zk-13.ae-rus.net:2181,prod-zk-14.ae-rus.net:2181,prod-zk-15.ae-rus.net:2181. (kafka.zookeeper.ZooKeeperClient) [2025-05-20 15:04:30,248] INFO [ZooKeeperClient KRaft Migration] Waiting until connected. 
(kafka.zookeeper.ZooKeeperClient) [2025-05-20 15:04:30,261] INFO [ZooKeeperClient KRaft Migration] Connected. (kafka.zookeeper.ZooKeeperClient) [2025-05-20 15:04:30,293] INFO [controller-111-ThrottledChannelReaper-Fetch]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper) [2025-05-20 15:04:30,293] INFO [controller-111-ThrottledChannelReaper-Produce]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper) [2025-05-20 15:04:30,294] INFO [controller-111-ThrottledChannelReaper-Request]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper) [2025-05-20 15:04:30,297] INFO [controller-111-ThrottledChannelReaper-ControllerMutation]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper) [2025-05-20 15:04:30,316] INFO [ExpirationReaper-111-AlterAcls]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2025-05-20 15:04:30,392] INFO [ControllerServer id=111] Waiting for the controller metadata publishers to be installed (kafka.server.ControllerServer) [2025-05-20 15:04:30,393] INFO [ControllerServer id=111] Finished waiting for the controller metadata publishers to be installed (kafka.server.ControllerServer) [2025-05-20 15:04:30,393] INFO [SocketServer listenerType=CONTROLLER, nodeId=111] Enabling request processing. (kafka.network.SocketServer) [2025-05-20 15:04:30,394] INFO [ControllerRegistrationManager id=111 incarnation=mbreItYzTmySXeahJkXxFA] Found registration for UwcgQ_r6T_uBVDMCdWLG6A instead of our incarnation. (kafka.server.ControllerRegistrationManager) [2025-05-20 15:04:30,395] INFO Awaiting socket connections on 0.0.0.0:9091. (kafka.network.DataPlaneAcceptor) [2025-05-20 15:04:30,455] INFO [controller-111-to-controller-registration-channel-manager]: Starting (kafka.server.NodeToControllerRequestThread) [2025-05-20 15:04:30,455] INFO [ControllerRegistrationManager id=111 incarnation=mbreItYzTmySXeahJkXxFA] initialized channel manager. 
(kafka.server.ControllerRegistrationManager) [2025-05-20 15:04:30,455] INFO [ControllerServer id=111] Waiting for all of the authorizer futures to be completed (kafka.server.ControllerServer) [2025-05-20 15:04:30,456] INFO [ControllerServer id=111] Finished waiting for all of the authorizer futures to be completed (kafka.server.ControllerServer) [2025-05-20 15:04:30,456] INFO [ControllerServer id=111] Waiting for all of the SocketServer Acceptors to be started (kafka.server.ControllerServer) [2025-05-20 15:04:30,456] INFO [ControllerServer id=111] Finished waiting for all of the SocketServer Acceptors to be started (kafka.server.ControllerServer) [2025-05-20 15:04:30,457] INFO [ControllerRegistrationManager id=111 incarnation=mbreItYzTmySXeahJkXxFA] sendControllerRegistration: attempting to send ControllerRegistrationRequestData(controllerId=111, incarnationId=mbreItYzTmySXeahJkXxFA, zkMigrationReady=true, listeners=[Listener(name='CONTROLLER', host='prodkafkacontr11z1.h.ae-rus.net', port=9091, securityProtocol=0)], features=[Feature(name='kraft.version', minSupportedVersion=0, maxSupportedVersion=1), Feature(name='metadata.version', minSupportedVersion=1, maxSupportedVersion=21)]) (kafka.server.ControllerRegistrationManager) [2025-05-20 15:04:30,471] INFO [controller-111-to-controller-registration-channel-manager]: Recorded new KRaft controller, from now on will use node prodkafkacontr11z1.h.ae-rus.net:9091 (id: 111 rack: null) (kafka.server.NodeToControllerRequestThread) [2025-05-20 15:04:30,505] INFO [KafkaRaftServer nodeId=111] Kafka Server started (kafka.server.KafkaRaftServer) [2025-05-20 15:04:30,579] INFO [ControllerRegistrationManager id=111 incarnation=mbreItYzTmySXeahJkXxFA] Our registration has been persisted to the metadata log. (kafka.server.ControllerRegistrationManager) [2025-05-20 15:04:30,581] INFO [ControllerRegistrationManager id=111 incarnation=mbreItYzTmySXeahJkXxFA] RegistrationResponseHandler: controller acknowledged ControllerRegistrationRequest. (kafka.server.ControllerRegistrationManager){code} {code:java} cat /var/log/kafka/controller.log [2025-05-20 15:04:30,357] TRACE [KRaftMigrationDriver id=111] Recovered migration state from ZK in 68765354 ns. Transitioned migration state from ZkMigrationLeadershipState{kraftControllerId=-1, kraftControllerEpoch=-1, kraftMetadataOffset=-1, kraftMetadataEpoch=-1, lastUpdatedTimeMs=-1, migrationZkVersion=-2, controllerZkEpoch=-1, controllerZkVersion=-2} to ZkMigrationLeadershipState{kraftControllerId=-1, kraftControllerEpoch=-1, kraftMetadataOffset=-1, kraftMetadataEpoch=-1, lastUpdatedTimeMs=1747648116174, migrationZkVersion=0, controllerZkEpoch=-1, controllerZkVersion=-2} (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,358] INFO [KRaftMigrationDriver id=111] Initial migration of ZK metadata is not done. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,358] INFO [KRaftMigrationDriver id=111] 111 transitioning from UNINITIALIZED to INACTIVE state (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,388] INFO [KRaftMigrationDriver id=111] Became active migration driver in 27621312 ns. 
Transitioned migration state from ZkMigrationLeadershipState{kraftControllerId=-1, kraftControllerEpoch=-1, kraftMetadataOffset=-1, kraftMetadataEpoch=-1, lastUpdatedTimeMs=1747648116174, migrationZkVersion=0, controllerZkEpoch=-1, controllerZkVersion=-2} to ZkMigrationLeadershipState{kraftControllerId=111, kraftControllerEpoch=2, kraftMetadataOffset=-1, kraftMetadataEpoch=-1, lastUpdatedTimeMs=1747648116174, migrationZkVersion=0, controllerZkEpoch=-1, controllerZkVersion=-2} (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,388] INFO [KRaftMigrationDriver id=111] 111 transitioning from INACTIVE to WAIT_FOR_CONTROLLER_QUORUM state (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,388] INFO [KRaftMigrationDriver id=111] Controller Quorum is ready for Zk to KRaft migration. Now waiting for ZK brokers. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,388] INFO [KRaftMigrationDriver id=111] 111 transitioning from WAIT_FOR_CONTROLLER_QUORUM to WAIT_FOR_BROKERS state (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,389] INFO [KRaftMigrationDriver id=111] No brokers are known to KRaft, waiting for brokers to register. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,494] INFO [KRaftMigrationDriver id=111] No brokers are known to KRaft, waiting for brokers to register. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,579] TRACE [KRaftMigrationDriver id=111] Received metadata delta, but the controller is not in dual-write mode. Ignoring this metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,580] INFO [KRaftMigrationDriver id=111] No brokers are known to KRaft, waiting for brokers to register. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,645] TRACE [KRaftMigrationDriver id=111] Received metadata delta, but the controller is not in dual-write mode. Ignoring this metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,645] INFO [KRaftMigrationDriver id=111] No brokers are known to KRaft, waiting for brokers to register. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,701] TRACE [KRaftMigrationDriver id=111] Received metadata delta, but the controller is not in dual-write mode. Ignoring this metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,701] INFO [KRaftMigrationDriver id=111] No brokers are known to KRaft, waiting for brokers to register. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,730] TRACE [KRaftMigrationDriver id=111] Received metadata delta, but the controller is not in dual-write mode. Ignoring this metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,730] INFO [KRaftMigrationDriver id=111] No brokers are known to KRaft, waiting for brokers to register. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,778] TRACE [KRaftMigrationDriver id=111] Received metadata delta, but the controller is not in dual-write mode. Ignoring this metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,861] TRACE [KRaftMigrationDriver id=111] Received metadata delta, but the controller is not in dual-write mode. Ignoring this metadata update. 
(org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,861] INFO [KRaftMigrationDriver id=111] No brokers are known to KRaft, waiting for brokers to register. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,905] TRACE [KRaftMigrationDriver id=111] Received metadata delta, but the controller is not in dual-write mode. Ignoring this metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:30,905] INFO [KRaftMigrationDriver id=111] No brokers are known to KRaft, waiting for brokers to register. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:31,202] TRACE [KRaftMigrationDriver id=111] Received metadata delta, but the controller is not in dual-write mode. Ignoring this metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:31,202] INFO [KRaftMigrationDriver id=111] No brokers are known to KRaft, waiting for brokers to register. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:31,278] TRACE [KRaftMigrationDriver id=111] Received metadata delta, but the controller is not in dual-write mode. Ignoring this metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) [2025-05-20 15:04:31,778] TRACE [KRaftMigrationDriver id=111] Received metadata delta, but the controller is not in dual-write mode. Ignoring this metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver)
{code}
The config on the controller node is the following:
{code:java}
cat /etc/kafka/server.properties
##
# Ansible managed
#host.name=prodkafkacontr14z4.h.ae-rus.net
cluster.name=MrgOne
node.id=114
controller.rack=z4
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector
process.roles=controller
delete.topic.enable=True
auto.create.topics.enable=True
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
listeners=CONTROLLER://:9091
advertised.listeners=CONTROLLER://prodkafkacontr14z4.h.ae-rus.net:9091
controller.listener.names=CONTROLLER
inter.broker.listener.name=EXTERNAL
controller.quorum.voters=1...@prodkafkacontr11z1.h.ae-rus.net:9091,1...@prodkafkacontr12z1.h.ae-rus.net:9091,1...@prodkafkacontr13z1.h.ae-rus.net:9091,1...@prodkafkacontr17z3.h.ae-rus.net:9091,1...@prodkafkacontr18z3.h.ae-rus.net:9091,1...@prodkafkacontr19z3.h.ae-rus.net:9091,1...@prodkafkacontr14z4.h.ae-rus.net:9091,1...@prodkafkacontr15z4.h.ae-rus.net:9091,1...@prodkafkacontr16z4.h.ae-rus.net:9091
controller.quorum.bootstrap.servers=prodkafkacontr11z1.h.ae-rus.net:9091,prodkafkacontr12z1.h.ae-rus.net:9091,prodkafkacontr13z1.h.ae-rus.net:9091,prodkafkacontr17z3.h.ae-rus.net:9091,prodkafkacontr18z3.h.ae-rus.net:9091,prodkafkacontr19z3.h.ae-rus.net:9091,prodkafkacontr14z4.h.ae-rus.net:9091,prodkafkacontr15z4.h.ae-rus.net:9091,prodkafkacontr16z4.h.ae-rus.net:9091
num.network.threads=128
background.threads=128
num.io.threads=128
socket.send.buffer.bytes=2097152
socket.receive.buffer.bytes=2097152
socket.request.max.bytes=104857600
queued.max.requests=32768
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=3
log.dirs=/data/kafka1/kafka
num.recovery.threads.per.data.dir=16
log.flush.interval.messages=10240
log.flush.interval.ms=5000
log.retention.hours=48
log.retention.bytes=1073741824
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
group.initial.rebalance.delay.ms=1000
auto.leader.rebalance.enable=True
leader.imbalance.check.interval.seconds=300
leader.imbalance.per.broker.percentage=20
default.replication.factor=3
num.replica.fetchers=64
replica.fetch.max.bytes=41943040
replica.fetch.wait.max.ms=1000
replica.high.watermark.checkpoint.interval.ms=5000
replica.socket.timeout.ms=30000
replica.socket.receive.buffer.bytes=65536
replica.lag.time.max.ms=10000
############################# Extra Properties #############################
offsets.topic.num.partitions=150
message.max.bytes=20971520
batch.size=65536
zookeeper.connect=prod-zk-11.ae-rus.net:2181,prod-zk-12.ae-rus.net:2181,prod-zk-13.ae-rus.net:2181,prod-zk-14.ae-rus.net:2181,prod-zk-15.ae-rus.net:2181
rack.aware.assignment.strategy=balance_subtopology
rack.aware.assignment.traffic_cost=org.apache.kafka.streams.processor.internals.assignment.RackAwareTaskAssignor
compression.gzip.level=5
zookeeper.metadata.migration.enable=True
zookeeper.metadata.migration.min.batch.size=512
initial.broker.registration.timeout.ms=18000
{code}
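For reference, the ZooKeeper-to-KRaft migration procedure expects each new controller's metadata log directory to be formatted with the cluster ID that already exists in ZooKeeper before the controller is first started with a configuration like the one above; a controller formatted with a freshly generated ID typically cannot accept registrations from the existing ZK brokers. A minimal sketch of that step, using the hosts and paths from this setup (the commands are illustrative, not taken from the report):
{code:java}
# Read the cluster ID the existing ZooKeeper cluster already uses (stored as JSON under /cluster/id).
/opt/kafka/bin/zookeeper-shell.sh prod-zk-11.ae-rus.net:2181 get /cluster/id
# -> {"version":"1","id":"9ca1s2tbTGm1Bi35aynf6Q"}   # should match the ClusterId reported further below

# Format the controller's log.dirs with that ID before the first start in migration mode.
/opt/kafka/bin/kafka-storage.sh format \
  --cluster-id 9ca1s2tbTGm1Bi35aynf6Q \
  --config /etc/kafka/server.properties
{code}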
The config on the broker node is the following:
{code:java}
$ cat /etc/kafka/server.properties
##
# Ansible managed
#host.name=prodkafka2z1.h.ae-rus.net
cluster.name=MrgOne
node.id=2
broker.rack=z1
replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelector
process.roles=broker
delete.topic.enable=True
auto.create.topics.enable=True
listener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL
listeners=EXTERNAL://:9092,INTERNAL://:29092
advertised.listeners=EXTERNAL://prod-kafka-12.ae-rus.net:9092,INTERNAL://prod-kafka-12.ae-rus.net:29092
controller.listener.names=CONTROLLER
inter.broker.listener.name=EXTERNAL
controller.quorum.voters=1...@prodkafkacontr11z1.h.ae-rus.net:9091,1...@prodkafkacontr12z1.h.ae-rus.net:9091,1...@prodkafkacontr13z1.h.ae-rus.net:9091,1...@prodkafkacontr17z3.h.ae-rus.net:9091,1...@prodkafkacontr18z3.h.ae-rus.net:9091,1...@prodkafkacontr19z3.h.ae-rus.net:9091,1...@prodkafkacontr14z4.h.ae-rus.net:9091,1...@prodkafkacontr15z4.h.ae-rus.net:9091,1...@prodkafkacontr16z4.h.ae-rus.net:9091
controller.quorum.bootstrap.servers=prodkafkacontr11z1.h.ae-rus.net:9091,prodkafkacontr12z1.h.ae-rus.net:9091,prodkafkacontr13z1.h.ae-rus.net:9091,prodkafkacontr17z3.h.ae-rus.net:9091,prodkafkacontr18z3.h.ae-rus.net:9091,prodkafkacontr19z3.h.ae-rus.net:9091,prodkafkacontr14z4.h.ae-rus.net:9091,prodkafkacontr15z4.h.ae-rus.net:9091,prodkafkacontr16z4.h.ae-rus.net:9091
num.network.threads=128
background.threads=128
num.io.threads=128
socket.send.buffer.bytes=2097152
socket.receive.buffer.bytes=2097152
socket.request.max.bytes=104857600
queued.max.requests=32768
offsets.topic.replication.factor=3
transaction.state.log.replication.factor=3
log.dirs=/data/kafka1/kafka,/data/kafka2/kafka,/data/kafka3/kafka,/data/kafka4/kafka,/data/kafka5/kafka,/data/kafka6/kafka
num.recovery.threads.per.data.dir=16
log.flush.interval.messages=10240
log.flush.interval.ms=5000
log.retention.hours=48
log.retention.bytes=1073741824
log.segment.bytes=1073741824
log.retention.check.interval.ms=300000
group.initial.rebalance.delay.ms=1000
auto.leader.rebalance.enable=True
leader.imbalance.check.interval.seconds=300
leader.imbalance.per.broker.percentage=20
default.replication.factor=3
num.replica.fetchers=64
replica.fetch.max.bytes=41943040
replica.fetch.wait.max.ms=1000
replica.high.watermark.checkpoint.interval.ms=5000
replica.socket.timeout.ms=30000
replica.socket.receive.buffer.bytes=65536
replica.lag.time.max.ms=10000
############################# Extra Properties #############################
offsets.topic.num.partitions=150
message.max.bytes=20971520
batch.size=65536
zookeeper.connect=prod-zk-11.ae-rus.net:2181,prod-zk-12.ae-rus.net:2181,prod-zk-13.ae-rus.net:2181,prod-zk-14.ae-rus.net:2181,prod-zk-15.ae-rus.net:2181
rack.aware.assignment.strategy=balance_subtopology
rack.aware.assignment.traffic_cost=org.apache.kafka.streams.processor.internals.assignment.RackAwareTaskAssignor
compression.gzip.level=5
zookeeper.metadata.migration.enable=True
zookeeper.metadata.migration.min.batch.size=512
initial.broker.registration.timeout.ms=18000
{code}
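Since the broker above keeps the same controller.quorum.voters list and must reach the controllers over the CONTROLLER listener on port 9091, one sanity check worth running from the broker host before restarting it with process.roles=broker is to confirm that every controller answers on that listener. A rough sketch using the same tool shown at the end of this report (hosts taken from the config above; the loop itself is illustrative, not from the report):
{code:java}
# Ask each controller for its view of the quorum from the broker machine;
# a timeout here would explain the registration RPC timeouts seen below.
for c in prodkafkacontr11z1 prodkafkacontr12z1 prodkafkacontr13z1 \
         prodkafkacontr17z3 prodkafkacontr18z3 prodkafkacontr19z3 \
         prodkafkacontr14z4 prodkafkacontr15z4 prodkafkacontr16z4; do
  echo "== ${c} =="
  /opt/kafka/bin/kafka-metadata-quorum.sh \
    --bootstrap-controller ${c}.h.ae-rus.net:9091 describe --status | grep -E 'ClusterId|LeaderId'
done
{code}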
When trying to move to the next step (`Migrating brokers to KRaft`), the broker fails to reach the controller quorum and crashes.
{code:java}
==> /var/log/kafka/kafka-server.log <== [2025-05-20 15:24:07,521] INFO Registered kafka:type=kafka.Log4jController MBean (kafka.utils.Log4jControllerRegistration$) [2025-05-20 15:24:07,826] INFO [BrokerServer id=2] Transition from SHUTDOWN to STARTING (kafka.server.BrokerServer) [2025-05-20 15:24:07,827] INFO [SharedServer id=2] Starting SharedServer (kafka.server.SharedServer) [2025-05-20 15:24:07,884] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Recovering unflushed segment 21124. 0/1 recovered for __cluster_metadata-0. (kafka.log.LogLoader) [2025-05-20 15:24:07,885] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Loading producer state till offset 21124 with message format version 2 (kafka.log.UnifiedLog$) [2025-05-20 15:24:07,886] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Reloading from producer snapshot and rebuilding producer state from offset 21124 (kafka.log.UnifiedLog$) [2025-05-20 15:24:07,893] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Producer state recovery took 0ms for snapshot load and 7ms for segment recovery from offset 21124 (kafka.log.UnifiedLog$) [2025-05-20 15:24:07,958] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Loading producer state till offset 31424 with message format version 2 (kafka.log.UnifiedLog$) [2025-05-20 15:24:07,958] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Reloading from producer snapshot and rebuilding producer state from offset 31424 (kafka.log.UnifiedLog$) [2025-05-20 15:24:07,960] INFO [LogLoader partition=__cluster_metadata-0, dir=/data/kafka1/kafka] Producer state recovery took 2ms for snapshot load and 0ms for segment recovery from offset 31424 (kafka.log.UnifiedLog$) [2025-05-20 15:24:07,995] INFO Initialized snapshots with IDs SortedSet(OffsetAndEpoch(offset=21124, epoch=7)) from /data/kafka1/kafka/__cluster_metadata-0 (kafka.raft.KafkaMetadataLog$) [2025-05-20 15:24:08,023] INFO [raft-expiration-reaper]: Starting (kafka.raft.TimingWheelExpirationService$ExpiredOperationReaper) [2025-05-20 15:24:08,301] INFO [BrokerServer id=2] Starting broker (kafka.server.BrokerServer) [2025-05-20 15:24:08,318] INFO [broker-2-ThrottledChannelReaper-Fetch]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper) [2025-05-20 15:24:08,319] INFO [broker-2-ThrottledChannelReaper-Produce]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper) [2025-05-20 15:24:08,320] INFO [broker-2-ThrottledChannelReaper-Request]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper) [2025-05-20 15:24:08,322] INFO [broker-2-ThrottledChannelReaper-ControllerMutation]: Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper) [2025-05-20 15:24:08,365] INFO 
[BrokerServer id=2] Waiting for controller quorum voters future (kafka.server.BrokerServer) [2025-05-20 15:24:08,365] INFO [BrokerServer id=2] Finished waiting for controller quorum voters future (kafka.server.BrokerServer) [2025-05-20 15:24:08,371] INFO [broker-2-to-controller-forwarding-channel-manager]: Starting (kafka.server.NodeToControllerRequestThread) [2025-05-20 15:24:08,592] INFO Updated connection-accept-rate max connection creation rate to 2147483647 (kafka.network.ConnectionQuotas) [2025-05-20 15:24:08,687] INFO [SocketServer listenerType=BROKER, nodeId=2] Created data-plane acceptor and processors for endpoint : ListenerName(EXTERNAL) (kafka.network.SocketServer) [2025-05-20 15:24:08,688] INFO Updated connection-accept-rate max connection creation rate to 2147483647 (kafka.network.ConnectionQuotas) [2025-05-20 15:24:08,759] INFO [SocketServer listenerType=BROKER, nodeId=2] Created data-plane acceptor and processors for endpoint : ListenerName(INTERNAL) (kafka.network.SocketServer) [2025-05-20 15:24:08,765] INFO [broker-2-to-controller-alter-partition-channel-manager]: Starting (kafka.server.NodeToControllerRequestThread) [2025-05-20 15:24:08,771] INFO [broker-2-to-controller-directory-assignments-channel-manager]: Starting (kafka.server.NodeToControllerRequestThread) [2025-05-20 15:24:08,788] INFO [ExpirationReaper-2-Produce]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2025-05-20 15:24:08,788] INFO [ExpirationReaper-2-Fetch]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2025-05-20 15:24:08,789] INFO [ExpirationReaper-2-DeleteRecords]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2025-05-20 15:24:08,790] INFO [ExpirationReaper-2-ElectLeader]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2025-05-20 15:24:08,790] INFO [ExpirationReaper-2-RemoteFetch]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2025-05-20 15:24:08,808] INFO [ExpirationReaper-2-Heartbeat]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2025-05-20 15:24:08,808] INFO [ExpirationReaper-2-Rebalance]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2025-05-20 15:24:08,863] INFO [broker-2-to-controller-heartbeat-channel-manager]: Starting (kafka.server.NodeToControllerRequestThread) [2025-05-20 15:24:08,865] INFO [BrokerLifecycleManager id=2] Incarnation ZvarJwVqQZOUzzobd91HlQ of broker 2 in cluster 9ca1s2tbTGm1Bi35aynf6Q is now STARTING. (kafka.server.BrokerLifecycleManager) [2025-05-20 15:24:08,898] INFO [ExpirationReaper-2-AlterAcls]: Starting (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) [2025-05-20 15:24:08,978] INFO [BrokerServer id=2] Waiting for the broker metadata publishers to be installed (kafka.server.BrokerServer) [2025-05-20 15:24:08,979] INFO [BrokerServer id=2] Finished waiting for the broker metadata publishers to be installed (kafka.server.BrokerServer) [2025-05-20 15:24:08,979] INFO [BrokerServer id=2] Waiting for the controller to acknowledge that we are caught up (kafka.server.BrokerServer) {code} {code:java} ==> /var/log/kafka/kafka-server.log <== [2025-05-20 15:24:13,380] INFO [BrokerLifecycleManager id=2] Unable to register the broker because the RPC got timed out before it could be sent. (kafka.server.BrokerLifecycleManager) [2025-05-20 15:24:17,990] INFO [BrokerLifecycleManager id=2] Unable to register the broker because the RPC got timed out before it could be sent. 
(kafka.server.BrokerLifecycleManager) [2025-05-20 15:24:22,701] INFO [BrokerLifecycleManager id=2] Unable to register the broker because the RPC got timed out before it could be sent. (kafka.server.BrokerLifecycleManager) [2025-05-20 15:24:26,864] ERROR [BrokerLifecycleManager id=2] Shutting down because we were unable to register with the controller quorum. (kafka.server.BrokerLifecycleManager) [2025-05-20 15:24:26,864] INFO [BrokerLifecycleManager id=2] Transitioning from STARTING to SHUTTING_DOWN. (kafka.server.BrokerLifecycleManager) [2025-05-20 15:24:26,865] INFO [broker-2-to-controller-heartbeat-channel-manager]: Shutting down (kafka.server.NodeToControllerRequestThread) [2025-05-20 15:24:26,865] INFO [broker-2-to-controller-heartbeat-channel-manager]: Stopped (kafka.server.NodeToControllerRequestThread) [2025-05-20 15:24:26,865] INFO [broker-2-to-controller-heartbeat-channel-manager]: Shutdown completed (kafka.server.NodeToControllerRequestThread) [2025-05-20 15:24:26,865] ERROR [BrokerServer id=2] Received a fatal error while waiting for the controller to acknowledge that we are caught up (kafka.server.BrokerServer) java.util.concurrent.CancellationException at java.base/java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2510) at kafka.server.BrokerLifecycleManager$ShutdownEvent.run(BrokerLifecycleManager.scala:638) at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:191) at java.base/java.lang.Thread.run(Thread.java:1583) [2025-05-20 15:24:26,867] INFO Node to controller channel manager for heartbeat shutdown (kafka.server.NodeToControllerChannelManagerImpl) [2025-05-20 15:24:26,868] INFO [BrokerServer id=2] Transition from STARTING to STARTED (kafka.server.BrokerServer) [2025-05-20 15:24:26,870] ERROR [BrokerServer id=2] Fatal error during broker startup. 
Prepare to shutdown (kafka.server.BrokerServer) java.lang.RuntimeException: Received a fatal error while waiting for the controller to acknowledge that we are caught up at org.apache.kafka.server.util.FutureUtils.waitWithLogging(FutureUtils.java:73) at kafka.server.BrokerServer.startup(BrokerServer.scala:502) at kafka.server.KafkaRaftServer.$anonfun$startup$2(KafkaRaftServer.scala:99) at kafka.server.KafkaRaftServer.$anonfun$startup$2$adapted(KafkaRaftServer.scala:99) at scala.Option.foreach(Option.scala:437) at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:99) at kafka.Kafka$.main(Kafka.scala:112) at kafka.Kafka.main(Kafka.scala) Caused by: java.util.concurrent.CancellationException at java.base/java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2510) at kafka.server.BrokerLifecycleManager$ShutdownEvent.run(BrokerLifecycleManager.scala:638) at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:191) at java.base/java.lang.Thread.run(Thread.java:1583) [2025-05-20 15:24:26,870] INFO [BrokerServer id=2] Transition from STARTED to SHUTTING_DOWN (kafka.server.BrokerServer) [2025-05-20 15:24:26,871] INFO [BrokerServer id=2] shutting down (kafka.server.BrokerServer) [2025-05-20 15:24:26,871] INFO [SocketServer listenerType=BROKER, nodeId=2] Stopping socket server request processors (kafka.network.SocketServer) [2025-05-20 15:24:26,874] ERROR Cannot invoke "java.nio.channels.ServerSocketChannel.close()" because the return value of "kafka.network.Acceptor.serverChannel()" is null (kafka.network.DataPlaneAcceptor) java.lang.NullPointerException: Cannot invoke "java.nio.channels.ServerSocketChannel.close()" because the return value of "kafka.network.Acceptor.serverChannel()" is null at kafka.network.Acceptor.$anonfun$closeAll$2(SocketServer.scala:711) at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:68) at kafka.network.Acceptor.closeAll(SocketServer.scala:711) at kafka.network.Acceptor.close(SocketServer.scala:678) at kafka.network.SocketServer.$anonfun$stopProcessingRequests$4(SocketServer.scala:287) at kafka.network.SocketServer.$anonfun$stopProcessingRequests$4$adapted(SocketServer.scala:287) at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619) at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617) at scala.collection.AbstractIterable.foreach(Iterable.scala:935) at kafka.network.SocketServer.stopProcessingRequests(SocketServer.scala:287) at kafka.server.BrokerServer.$anonfun$shutdown$5(BrokerServer.scala:638) at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:68) at kafka.server.BrokerServer.shutdown(BrokerServer.scala:638) at kafka.server.KafkaBroker.shutdown(KafkaBroker.scala:98) at kafka.server.KafkaBroker.shutdown$(KafkaBroker.scala:98) at kafka.server.BrokerServer.shutdown(BrokerServer.scala:67) at kafka.server.BrokerServer.startup(BrokerServer.scala:555) at kafka.server.KafkaRaftServer.$anonfun$startup$2(KafkaRaftServer.scala:99) at kafka.server.KafkaRaftServer.$anonfun$startup$2$adapted(KafkaRaftServer.scala:99) at scala.Option.foreach(Option.scala:437) at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:99) at kafka.Kafka$.main(Kafka.scala:112) at kafka.Kafka.main(Kafka.scala) [2025-05-20 15:24:26,906] ERROR Cannot invoke "java.nio.channels.ServerSocketChannel.close()" because the return value of "kafka.network.Acceptor.serverChannel()" is null (kafka.network.DataPlaneAcceptor) java.lang.NullPointerException: Cannot invoke "java.nio.channels.ServerSocketChannel.close()" because the return value 
of "kafka.network.Acceptor.serverChannel()" is null at kafka.network.Acceptor.$anonfun$closeAll$2(SocketServer.scala:711) at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:68) at kafka.network.Acceptor.closeAll(SocketServer.scala:711) at kafka.network.Acceptor.close(SocketServer.scala:678) at kafka.network.SocketServer.$anonfun$stopProcessingRequests$4(SocketServer.scala:287) at kafka.network.SocketServer.$anonfun$stopProcessingRequests$4$adapted(SocketServer.scala:287) at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619) at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617) at scala.collection.AbstractIterable.foreach(Iterable.scala:935) at kafka.network.SocketServer.stopProcessingRequests(SocketServer.scala:287) at kafka.server.BrokerServer.$anonfun$shutdown$5(BrokerServer.scala:638) at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:68) at kafka.server.BrokerServer.shutdown(BrokerServer.scala:638) at kafka.server.KafkaBroker.shutdown(KafkaBroker.scala:98) at kafka.server.KafkaBroker.shutdown$(KafkaBroker.scala:98) at kafka.server.BrokerServer.shutdown(BrokerServer.scala:67) at kafka.server.BrokerServer.startup(BrokerServer.scala:555) at kafka.server.KafkaRaftServer.$anonfun$startup$2(KafkaRaftServer.scala:99) at kafka.server.KafkaRaftServer.$anonfun$startup$2$adapted(KafkaRaftServer.scala:99) at scala.Option.foreach(Option.scala:437) at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:99) at kafka.Kafka$.main(Kafka.scala:112) at kafka.Kafka.main(Kafka.scala) [2025-05-20 15:24:26,932] INFO [SocketServer listenerType=BROKER, nodeId=2] Stopped socket server request processors (kafka.network.SocketServer)
{code}
The quorum is OK and accessible from the brokers:
{code:java}
root@prodkafka2z1:/data# /opt/kafka/bin/kafka-metadata-quorum.sh --bootstrap-controller prodkafkacontr11z1.h.ae-rus.net:9091 describe --replication
NodeId  DirectoryId             LogEndOffset  Lag  LastFetchTimestamp  LastCaughtUpTimestamp  Status
111     L_5_4pNVTlyBl6_go_TkMA  5457          0    1747754839735       1747754839735          Leader
117     uXDetGxGYrzMssuEw-TsKQ  5457          0    1747754839705       1747754839705          Observer
118     EAqK0bysSLhV57oMWfeRbw  5457          0    1747754839705       1747754839705          Observer
112     utKjIX3i9UJU-dZf5e5rfQ  5457          0    1747754839700       1747754839700          Observer
113     gqu36Z80ogEmZ0cflxvAAw  5457          0    1747754839699       1747754839699          Observer
119     UcFbHUd5RGqXoqmsNXYYsA  5457          0    1747754839705       1747754839705          Observer
115     jH9oLOSeGoth0R7w6CPT3A  5457          0    1747754839701       1747754839701          Observer
114     nMGKPes0EA8gB4YS1E7KJw  5457          0    1747754839701       1747754839701          Observer
116     eoiyLM_huf88Ny5ZexL_7w  5457          0    1747754839701       1747754839701          Observer

root@prodkafka2z1:/data# /opt/kafka/bin/kafka-metadata-quorum.sh --bootstrap-controller prodkafkacontr11z1.h.ae-rus.net:9091 describe --status
ClusterId:              9ca1s2tbTGm1Bi35aynf6Q
LeaderId:               111
LeaderEpoch:            2
HighWatermark:          5467
MaxFollowerLag:         0
MaxFollowerLagTimeMs:   0
CurrentVoters:          [{"id": 111, "directoryId": "L_5_4pNVTlyBl6_go_TkMA", "endpoints": ["CONTROLLER://prodkafkacontr11z1.h.ae-rus.net:9091"]}]
CurrentObservers:       [{"id": 117, "directoryId": "uXDetGxGYrzMssuEw-TsKQ"}, {"id": 118, "directoryId": "EAqK0bysSLhV57oMWfeRbw"}, {"id": 112, "directoryId": "utKjIX3i9UJU-dZf5e5rfQ"}, {"id": 113, "directoryId": "gqu36Z80ogEmZ0cflxvAAw"}, {"id": 119, "directoryId": "UcFbHUd5RGqXoqmsNXYYsA"}, {"id": 115, "directoryId": "jH9oLOSeGoth0R7w6CPT3A"}, {"id": 114, "directoryId": "nMGKPes0EA8gB4YS1E7KJw"}, {"id": 116, "directoryId": "eoiyLM_huf88Ny5ZexL_7w"}]
{code}
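Note that `describe --status` above lists only controller 111 under CurrentVoters, while the other eight controllers appear as observers even though all nine are in controller.quorum.voters. A hedged diagnostic sketch for cross-checking how each node was formatted (paths are the log.dirs values from the configs above; the commands are illustrative, not from the report):
{code:java}
# On each controller (log.dirs=/data/kafka1/kafka) and on a broker (first log.dirs entry):
cat /data/kafka1/kafka/meta.properties
# shows cluster.id and node.id (plus directory.id on recent releases)

# Cluster ID registered in ZooKeeper:
/opt/kafka/bin/zookeeper-shell.sh prod-zk-11.ae-rus.net:2181 get /cluster/id

# Voter set as seen by the current leader (same command as above):
/opt/kafka/bin/kafka-metadata-quorum.sh \
  --bootstrap-controller prodkafkacontr11z1.h.ae-rus.net:9091 describe --status
{code}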
Prepare to shutdown (kafka.server.BrokerServer) java.lang.RuntimeException: Received a fatal error while waiting for the controller to acknowledge that we are caught up at org.apache.kafka.server.util.FutureUtils.waitWithLogging(FutureUtils.java:73) at kafka.server.BrokerServer.startup(BrokerServer.scala:502) at kafka.server.KafkaRaftServer.$anonfun$startup$2(KafkaRaftServer.scala:99) at kafka.server.KafkaRaftServer.$anonfun$startup$2$adapted(KafkaRaftServer.scala:99) at scala.Option.foreach(Option.scala:437) at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:99) at kafka.Kafka$.main(Kafka.scala:112) at kafka.Kafka.main(Kafka.scala) Caused by: java.util.concurrent.CancellationException at java.base/java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2510) at kafka.server.BrokerLifecycleManager$ShutdownEvent.run(BrokerLifecycleManager.scala:638) at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:191) at java.base/java.lang.Thread.run(Thread.java:1583) [2025-05-20 15:24:26,870] INFO [BrokerServer id=2] Transition from STARTED to SHUTTING_DOWN (kafka.server.BrokerServer) [2025-05-20 15:24:26,871] INFO [BrokerServer id=2] shutting down (kafka.server.BrokerServer) [2025-05-20 15:24:26,871] INFO [SocketServer listenerType=BROKER, nodeId=2] Stopping socket server request processors (kafka.network.SocketServer) [2025-05-20 15:24:26,874] ERROR Cannot invoke "java.nio.channels.ServerSocketChannel.close()" because the return value of "kafka.network.Acceptor.serverChannel()" is null (kafka.network.DataPlaneAcceptor) java.lang.NullPointerException: Cannot invoke "java.nio.channels.ServerSocketChannel.close()" because the return value of "kafka.network.Acceptor.serverChannel()" is null at kafka.network.Acceptor.$anonfun$closeAll$2(SocketServer.scala:711) at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:68) at kafka.network.Acceptor.closeAll(SocketServer.scala:711) at kafka.network.Acceptor.close(SocketServer.scala:678) at kafka.network.SocketServer.$anonfun$stopProcessingRequests$4(SocketServer.scala:287) at kafka.network.SocketServer.$anonfun$stopProcessingRequests$4$adapted(SocketServer.scala:287) at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619) at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617) at scala.collection.AbstractIterable.foreach(Iterable.scala:935) at kafka.network.SocketServer.stopProcessingRequests(SocketServer.scala:287) at kafka.server.BrokerServer.$anonfun$shutdown$5(BrokerServer.scala:638) at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:68) at kafka.server.BrokerServer.shutdown(BrokerServer.scala:638) at kafka.server.KafkaBroker.shutdown(KafkaBroker.scala:98) at kafka.server.KafkaBroker.shutdown$(KafkaBroker.scala:98) at kafka.server.BrokerServer.shutdown(BrokerServer.scala:67) at kafka.server.BrokerServer.startup(BrokerServer.scala:555) at kafka.server.KafkaRaftServer.$anonfun$startup$2(KafkaRaftServer.scala:99) at kafka.server.KafkaRaftServer.$anonfun$startup$2$adapted(KafkaRaftServer.scala:99) at scala.Option.foreach(Option.scala:437) at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:99) at kafka.Kafka$.main(Kafka.scala:112) at kafka.Kafka.main(Kafka.scala) [2025-05-20 15:24:26,906] ERROR Cannot invoke "java.nio.channels.ServerSocketChannel.close()" because the return value of "kafka.network.Acceptor.serverChannel()" is null (kafka.network.DataPlaneAcceptor) java.lang.NullPointerException: Cannot invoke "java.nio.channels.ServerSocketChannel.close()" because the return value 
of "kafka.network.Acceptor.serverChannel()" is null at kafka.network.Acceptor.$anonfun$closeAll$2(SocketServer.scala:711) at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:68) at kafka.network.Acceptor.closeAll(SocketServer.scala:711) at kafka.network.Acceptor.close(SocketServer.scala:678) at kafka.network.SocketServer.$anonfun$stopProcessingRequests$4(SocketServer.scala:287) at kafka.network.SocketServer.$anonfun$stopProcessingRequests$4$adapted(SocketServer.scala:287) at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619) at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617) at scala.collection.AbstractIterable.foreach(Iterable.scala:935) at kafka.network.SocketServer.stopProcessingRequests(SocketServer.scala:287) at kafka.server.BrokerServer.$anonfun$shutdown$5(BrokerServer.scala:638) at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:68) at kafka.server.BrokerServer.shutdown(BrokerServer.scala:638) at kafka.server.KafkaBroker.shutdown(KafkaBroker.scala:98) at kafka.server.KafkaBroker.shutdown$(KafkaBroker.scala:98) at kafka.server.BrokerServer.shutdown(BrokerServer.scala:67) at kafka.server.BrokerServer.startup(BrokerServer.scala:555) at kafka.server.KafkaRaftServer.$anonfun$startup$2(KafkaRaftServer.scala:99) at kafka.server.KafkaRaftServer.$anonfun$startup$2$adapted(KafkaRaftServer.scala:99) at scala.Option.foreach(Option.scala:437) at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:99) at kafka.Kafka$.main(Kafka.scala:112) at kafka.Kafka.main(Kafka.scala) [2025-05-20 15:24:26,932] INFO [SocketServer listenerType=BROKER, nodeId=2] Stopped socket server request processors (kafka.network.SocketServer) {code} > Zookeeper-Kraft failing migration - RPC got timed out before it could be sent > ----------------------------------------------------------------------------- > > Key: KAFKA-19318 > URL: https://issues.apache.org/jira/browse/KAFKA-19318 > Project: Kafka > Issue Type: Bug > Components: kraft > Affects Versions: 3.9.0 > Reporter: dennis lucero > Priority: Major > > Despite several attempts to migrate from Zookeeper cluster to Kraft, it > failed to properly migrate. > We spawn a need cluster fully healthy with 18 Kafka nodes connected to 5 > Zookeeper nodes. 9 new Kafka nodes are there for the new controllers. > It was tested with Kafka 3.9.0 > The controllers are started with: > {code:java} > ==> /var/log/kafka/kafka-server.log <== > [2025-05-20 15:04:28,798] INFO Registered kafka:type=kafka.Log4jController > MBean (kafka.utils.Log4jControllerRegistration$) > [2025-05-20 15:04:29,207] INFO [ControllerServer id=111] Starting controller > (kafka.server.ControllerServer) > [2025-05-20 15:04:29,656] INFO Updated connection-accept-rate max connection > creation rate to 2147483647 (kafka.network.ConnectionQuotas) > [2025-05-20 15:04:29,784] INFO [SocketServer listenerType=CONTROLLER, > nodeId=111] Created data-plane acceptor and processors for endpoint : > ListenerName(CONTROLLER) (kafka.network.SocketServer) > [2025-05-20 15:04:29,792] INFO [SharedServer id=111] Starting SharedServer > (kafka.server.SharedServer) > [2025-05-20 15:04:29,841] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Recovering unflushed segment 0. 0/1 recovered for > __cluster_metadata-0. 
(kafka.log.LogLoader) > [2025-05-20 15:04:29,842] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Loading producer state till offset 0 with message > format version 2 (kafka.log.UnifiedLog$) > [2025-05-20 15:04:29,843] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Reloading from producer snapshot and rebuilding > producer state from offset 0 (kafka.log.UnifiedLog$) > [2025-05-20 15:04:29,843] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Producer state recovery took 0ms for snapshot load > and 0ms for segment recovery from offset 0 (kafka.log.UnifiedLog$) > [2025-05-20 15:04:29,905] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Loading producer state till offset 2709 with message > format version 2 (kafka.log.UnifiedLog$) > [2025-05-20 15:04:29,905] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Reloading from producer snapshot and rebuilding > producer state from offset 2709 (kafka.log.UnifiedLog$) > [2025-05-20 15:04:29,906] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Producer state recovery took 1ms for snapshot load > and 0ms for segment recovery from offset 2709 (kafka.log.UnifiedLog$) > [2025-05-20 15:04:29,940] INFO Initialized snapshots with IDs > SortedSet(OffsetAndEpoch(offset=0, epoch=0)) from > /data/kafka1/kafka/__cluster_metadata-0 (kafka.raft.KafkaMetadataLog$) > [2025-05-20 15:04:29,952] INFO [raft-expiration-reaper]: Starting > (kafka.raft.TimingWheelExpirationService$ExpiredOperationReaper) > [2025-05-20 15:04:30,191] INFO [ControllerServer id=111] Waiting for > controller quorum voters future (kafka.server.ControllerServer) > [2025-05-20 15:04:30,191] INFO [ControllerServer id=111] Finished waiting for > controller quorum voters future (kafka.server.ControllerServer) > [2025-05-20 15:04:30,233] INFO [ZooKeeperClient KRaft Migration] Initializing > a new session to > prod-zk-11.ae-rus.net:2181,prod-zk-12.ae-rus.net:2181,prod-zk-13.ae-rus.net:2181,prod-zk-14.ae-rus.net:2181,prod-zk-15.ae-rus.net:2181. > (kafka.zookeeper.ZooKeeperClient) > [2025-05-20 15:04:30,248] INFO [ZooKeeperClient KRaft Migration] Waiting > until connected. (kafka.zookeeper.ZooKeeperClient) > [2025-05-20 15:04:30,261] INFO [ZooKeeperClient KRaft Migration] Connected. 
> (kafka.zookeeper.ZooKeeperClient) > [2025-05-20 15:04:30,293] INFO [controller-111-ThrottledChannelReaper-Fetch]: > Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper) > [2025-05-20 15:04:30,293] INFO > [controller-111-ThrottledChannelReaper-Produce]: Starting > (kafka.server.ClientQuotaManager$ThrottledChannelReaper) > [2025-05-20 15:04:30,294] INFO > [controller-111-ThrottledChannelReaper-Request]: Starting > (kafka.server.ClientQuotaManager$ThrottledChannelReaper) > [2025-05-20 15:04:30,297] INFO > [controller-111-ThrottledChannelReaper-ControllerMutation]: Starting > (kafka.server.ClientQuotaManager$ThrottledChannelReaper) > [2025-05-20 15:04:30,316] INFO [ExpirationReaper-111-AlterAcls]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2025-05-20 15:04:30,392] INFO [ControllerServer id=111] Waiting for the > controller metadata publishers to be installed (kafka.server.ControllerServer) > [2025-05-20 15:04:30,393] INFO [ControllerServer id=111] Finished waiting for > the controller metadata publishers to be installed > (kafka.server.ControllerServer) > [2025-05-20 15:04:30,393] INFO [SocketServer listenerType=CONTROLLER, > nodeId=111] Enabling request processing. (kafka.network.SocketServer) > [2025-05-20 15:04:30,394] INFO [ControllerRegistrationManager id=111 > incarnation=mbreItYzTmySXeahJkXxFA] Found registration for > UwcgQ_r6T_uBVDMCdWLG6A instead of our incarnation. > (kafka.server.ControllerRegistrationManager) > [2025-05-20 15:04:30,395] INFO Awaiting socket connections on 0.0.0.0:9091. > (kafka.network.DataPlaneAcceptor) > [2025-05-20 15:04:30,455] INFO > [controller-111-to-controller-registration-channel-manager]: Starting > (kafka.server.NodeToControllerRequestThread) > [2025-05-20 15:04:30,455] INFO [ControllerRegistrationManager id=111 > incarnation=mbreItYzTmySXeahJkXxFA] initialized channel manager. 
> (kafka.server.ControllerRegistrationManager) > [2025-05-20 15:04:30,455] INFO [ControllerServer id=111] Waiting for all of > the authorizer futures to be completed (kafka.server.ControllerServer) > [2025-05-20 15:04:30,456] INFO [ControllerServer id=111] Finished waiting for > all of the authorizer futures to be completed (kafka.server.ControllerServer) > [2025-05-20 15:04:30,456] INFO [ControllerServer id=111] Waiting for all of > the SocketServer Acceptors to be started (kafka.server.ControllerServer) > [2025-05-20 15:04:30,456] INFO [ControllerServer id=111] Finished waiting for > all of the SocketServer Acceptors to be started > (kafka.server.ControllerServer) > [2025-05-20 15:04:30,457] INFO [ControllerRegistrationManager id=111 > incarnation=mbreItYzTmySXeahJkXxFA] sendControllerRegistration: attempting to > send ControllerRegistrationRequestData(controllerId=111, > incarnationId=mbreItYzTmySXeahJkXxFA, zkMigrationReady=true, > listeners=[Listener(name='CONTROLLER', > host='prodkafkacontr11z1.h.ae-rus.net', port=9091, securityProtocol=0)], > features=[Feature(name='kraft.version', minSupportedVersion=0, > maxSupportedVersion=1), Feature(name='metadata.version', > minSupportedVersion=1, maxSupportedVersion=21)]) > (kafka.server.ControllerRegistrationManager) > [2025-05-20 15:04:30,471] INFO > [controller-111-to-controller-registration-channel-manager]: Recorded new > KRaft controller, from now on will use node > prodkafkacontr11z1.h.ae-rus.net:9091 (id: 111 rack: null) > (kafka.server.NodeToControllerRequestThread) > [2025-05-20 15:04:30,505] INFO [KafkaRaftServer nodeId=111] Kafka Server > started (kafka.server.KafkaRaftServer) > [2025-05-20 15:04:30,579] INFO [ControllerRegistrationManager id=111 > incarnation=mbreItYzTmySXeahJkXxFA] Our registration has been persisted to > the metadata log. (kafka.server.ControllerRegistrationManager) > [2025-05-20 15:04:30,581] INFO [ControllerRegistrationManager id=111 > incarnation=mbreItYzTmySXeahJkXxFA] RegistrationResponseHandler: controller > acknowledged ControllerRegistrationRequest. > (kafka.server.ControllerRegistrationManager){code} > {code:java} > cat /var/log/kafka/controller.log > [2025-05-20 15:04:30,357] TRACE [KRaftMigrationDriver id=111] Recovered > migration state from ZK in 68765354 ns. Transitioned migration state from > ZkMigrationLeadershipState{kraftControllerId=-1, kraftControllerEpoch=-1, > kraftMetadataOffset=-1, kraftMetadataEpoch=-1, lastUpdatedTimeMs=-1, > migrationZkVersion=-2, controllerZkEpoch=-1, controllerZkVersion=-2} to > ZkMigrationLeadershipState{kraftControllerId=-1, kraftControllerEpoch=-1, > kraftMetadataOffset=-1, kraftMetadataEpoch=-1, > lastUpdatedTimeMs=1747648116174, migrationZkVersion=0, controllerZkEpoch=-1, > controllerZkVersion=-2} > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,358] INFO [KRaftMigrationDriver id=111] Initial > migration of ZK metadata is not done. > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,358] INFO [KRaftMigrationDriver id=111] 111 > transitioning from UNINITIALIZED to INACTIVE state > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,388] INFO [KRaftMigrationDriver id=111] Became active > migration driver in 27621312 ns. 
Transitioned migration state from > ZkMigrationLeadershipState{kraftControllerId=-1, kraftControllerEpoch=-1, > kraftMetadataOffset=-1, kraftMetadataEpoch=-1, > lastUpdatedTimeMs=1747648116174, migrationZkVersion=0, controllerZkEpoch=-1, > controllerZkVersion=-2} to ZkMigrationLeadershipState{kraftControllerId=111, > kraftControllerEpoch=2, kraftMetadataOffset=-1, kraftMetadataEpoch=-1, > lastUpdatedTimeMs=1747648116174, migrationZkVersion=0, controllerZkEpoch=-1, > controllerZkVersion=-2} > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,388] INFO [KRaftMigrationDriver id=111] 111 > transitioning from INACTIVE to WAIT_FOR_CONTROLLER_QUORUM state > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,388] INFO [KRaftMigrationDriver id=111] Controller > Quorum is ready for Zk to KRaft migration. Now waiting for ZK brokers. > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,388] INFO [KRaftMigrationDriver id=111] 111 > transitioning from WAIT_FOR_CONTROLLER_QUORUM to WAIT_FOR_BROKERS state > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,389] INFO [KRaftMigrationDriver id=111] No brokers are > known to KRaft, waiting for brokers to register. > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,494] INFO [KRaftMigrationDriver id=111] No brokers are > known to KRaft, waiting for brokers to register. > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,579] TRACE [KRaftMigrationDriver id=111] Received > metadata delta, but the controller is not in dual-write mode. Ignoring this > metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,580] INFO [KRaftMigrationDriver id=111] No brokers are > known to KRaft, waiting for brokers to register. > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,645] TRACE [KRaftMigrationDriver id=111] Received > metadata delta, but the controller is not in dual-write mode. Ignoring this > metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,645] INFO [KRaftMigrationDriver id=111] No brokers are > known to KRaft, waiting for brokers to register. > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,701] TRACE [KRaftMigrationDriver id=111] Received > metadata delta, but the controller is not in dual-write mode. Ignoring this > metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,701] INFO [KRaftMigrationDriver id=111] No brokers are > known to KRaft, waiting for brokers to register. > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,730] TRACE [KRaftMigrationDriver id=111] Received > metadata delta, but the controller is not in dual-write mode. Ignoring this > metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,730] INFO [KRaftMigrationDriver id=111] No brokers are > known to KRaft, waiting for brokers to register. > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,778] TRACE [KRaftMigrationDriver id=111] Received > metadata delta, but the controller is not in dual-write mode. Ignoring this > metadata update. 
(org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,861] TRACE [KRaftMigrationDriver id=111] Received > metadata delta, but the controller is not in dual-write mode. Ignoring this > metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,861] INFO [KRaftMigrationDriver id=111] No brokers are > known to KRaft, waiting for brokers to register. > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,905] TRACE [KRaftMigrationDriver id=111] Received > metadata delta, but the controller is not in dual-write mode. Ignoring this > metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:30,905] INFO [KRaftMigrationDriver id=111] No brokers are > known to KRaft, waiting for brokers to register. > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:31,202] TRACE [KRaftMigrationDriver id=111] Received > metadata delta, but the controller is not in dual-write mode. Ignoring this > metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:31,202] INFO [KRaftMigrationDriver id=111] No brokers are > known to KRaft, waiting for brokers to register. > (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:31,278] TRACE [KRaftMigrationDriver id=111] Received > metadata delta, but the controller is not in dual-write mode. Ignoring this > metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > [2025-05-20 15:04:31,778] TRACE [KRaftMigrationDriver id=111] Received > metadata delta, but the controller is not in dual-write mode. Ignoring this > metadata update. (org.apache.kafka.metadata.migration.KRaftMigrationDriver) > {code} > The config on the controller node is the following > {code:java} > cat /etc/kafka/server.properties > ## > # Ansible managed > #host.name=prodkafkacontr14z4.h.ae-rus.netcluster.name=MrgOne > node.id=114 > controller.rack=z4replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelectorprocess.roles=controllerdelete.topic.enable=True > auto.create.topics.enable=Truelistener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL > listeners=CONTROLLER://:9091advertised.listeners=CONTROLLER://prodkafkacontr14z4.h.ae-rus.net:9091 > controller.listener.names=CONTROLLER > inter.broker.listener.name=EXTERNALcontroller.quorum.voters=1...@prodkafkacontr11z1.h.ae-rus.net:9091,1...@prodkafkacontr12z1.h.ae-rus.net:9091,1...@prodkafkacontr13z1.h.ae-rus.net:9091,1...@prodkafkacontr17z3.h.ae-rus.net:9091,1...@prodkafkacontr18z3.h.ae-rus.net:9091,1...@prodkafkacontr19z3.h.ae-rus.net:9091,1...@prodkafkacontr14z4.h.ae-rus.net:9091,1...@prodkafkacontr15z4.h.ae-rus.net:9091,1...@prodkafkacontr16z4.h.ae-rus.net:9091controller.quorum.bootstrap.servers=prodkafkacontr11z1.h.ae-rus.net:9091,prodkafkacontr12z1.h.ae-rus.net:9091,prodkafkacontr13z1.h.ae-rus.net:9091,prodkafkacontr17z3.h.ae-rus.net:9091,prodkafkacontr18z3.h.ae-rus.net:9091,prodkafkacontr19z3.h.ae-rus.net:9091,prodkafkacontr14z4.h.ae-rus.net:9091,prodkafkacontr15z4.h.ae-rus.net:9091,prodkafkacontr16z4.h.ae-rus.net:9091 > num.network.threads=128 > background.threads=128num.io.threads=128 > socket.send.buffer.bytes=2097152 > socket.receive.buffer.bytes=2097152 > socket.request.max.bytes=104857600 > queued.max.requests=32768offsets.topic.replication.factor=3 > 
transaction.state.log.replication.factor=3log.dirs=/data/kafka1/kafka > num.recovery.threads.per.data.dir=16log.flush.interval.messages=10240 > log.flush.interval.ms=5000log.retention.hours=48 > log.retention.bytes=1073741824 > log.segment.bytes=1073741824 > log.retention.check.interval.ms=300000group.initial.rebalance.delay.ms=1000auto.leader.rebalance.enable=True > leader.imbalance.check.interval.seconds=300 > leader.imbalance.per.broker.percentage=20default.replication.factor=3 > num.replica.fetchers=64 > replica.fetch.max.bytes=41943040 > replica.fetch.wait.max.ms=1000 > replica.high.watermark.checkpoint.interval.ms=5000 > replica.socket.timeout.ms=30000 > replica.socket.receive.buffer.bytes=65536 > replica.lag.time.max.ms=10000############################# Extra Properties > ############################# > offsets.topic.num.partitions=150 > message.max.bytes=20971520 > batch.size=65536 > zookeeper.connect=prod-zk-11.ae-rus.net:2181,prod-zk-12.ae-rus.net:2181,prod-zk-13.ae-rus.net:2181,prod-zk-14.ae-rus.net:2181,prod-zk-15.ae-rus.net:2181 > rack.aware.assignment.strategy=balance_subtopology > rack.aware.assignment.traffic_cost=org.apache.kafka.streams.processor.internals.assignment.RackAwareTaskAssignor > compression.gzip.level=5 > zookeeper.metadata.migration.enable=True > zookeeper.metadata.migration.min.batch.size=512 > initial.broker.registration.timeout.ms=18000 {code} > The config on the broker node is the following > {code:java} > $ cat /etc/kafka/server.properties > ## > # Ansible managed > #host.name=prodkafka2z1.h.ae-rus.netcluster.name=MrgOne > node.id=2 > broker.rack=z1replica.selector.class=org.apache.kafka.common.replica.RackAwareReplicaSelectorprocess.roles=brokerdelete.topic.enable=True > auto.create.topics.enable=Truelistener.security.protocol.map=INTERNAL:PLAINTEXT,EXTERNAL:PLAINTEXT,CONTROLLER:PLAINTEXT,SSL:SSL,SASL_PLAINTEXT:SASL_PLAINTEXT,SASL_SSL:SASL_SSL > listeners=EXTERNAL://:9092,INTERNAL://:29092advertised.listeners=EXTERNAL://prod-kafka-12.ae-rus.net:9092,INTERNAL://prod-kafka-12.ae-rus.net:29092 > controller.listener.names=CONTROLLER > inter.broker.listener.name=EXTERNALcontroller.quorum.voters=1...@prodkafkacontr11z1.h.ae-rus.net:9091,1...@prodkafkacontr12z1.h.ae-rus.net:9091,1...@prodkafkacontr13z1.h.ae-rus.net:9091,1...@prodkafkacontr17z3.h.ae-rus.net:9091,1...@prodkafkacontr18z3.h.ae-rus.net:9091,1...@prodkafkacontr19z3.h.ae-rus.net:9091,1...@prodkafkacontr14z4.h.ae-rus.net:9091,1...@prodkafkacontr15z4.h.ae-rus.net:9091,1...@prodkafkacontr16z4.h.ae-rus.net:9091controller.quorum.bootstrap.servers=prodkafkacontr11z1.h.ae-rus.net:9091,prodkafkacontr12z1.h.ae-rus.net:9091,prodkafkacontr13z1.h.ae-rus.net:9091,prodkafkacontr17z3.h.ae-rus.net:9091,prodkafkacontr18z3.h.ae-rus.net:9091,prodkafkacontr19z3.h.ae-rus.net:9091,prodkafkacontr14z4.h.ae-rus.net:9091,prodkafkacontr15z4.h.ae-rus.net:9091,prodkafkacontr16z4.h.ae-rus.net:9091 > num.network.threads=128 > background.threads=128num.io.threads=128 > socket.send.buffer.bytes=2097152 > socket.receive.buffer.bytes=2097152 > socket.request.max.bytes=104857600 > queued.max.requests=32768offsets.topic.replication.factor=3 > transaction.state.log.replication.factor=3log.dirs=/data/kafka1/kafka,/data/kafka2/kafka,/data/kafka3/kafka,/data/kafka4/kafka,/data/kafka5/kafka,/data/kafka6/kafka > num.recovery.threads.per.data.dir=16log.flush.interval.messages=10240 > log.flush.interval.ms=5000log.retention.hours=48 > log.retention.bytes=1073741824 > log.segment.bytes=1073741824 > 
log.retention.check.interval.ms=300000group.initial.rebalance.delay.ms=1000auto.leader.rebalance.enable=True > leader.imbalance.check.interval.seconds=300 > leader.imbalance.per.broker.percentage=20default.replication.factor=3 > num.replica.fetchers=64 > replica.fetch.max.bytes=41943040 > replica.fetch.wait.max.ms=1000 > replica.high.watermark.checkpoint.interval.ms=5000 > replica.socket.timeout.ms=30000 > replica.socket.receive.buffer.bytes=65536 > replica.lag.time.max.ms=10000############################# Extra Properties > ############################# > offsets.topic.num.partitions=150 > message.max.bytes=20971520 > batch.size=65536 > zookeeper.connect=prod-zk-11.ae-rus.net:2181,prod-zk-12.ae-rus.net:2181,prod-zk-13.ae-rus.net:2181,prod-zk-14.ae-rus.net:2181,prod-zk-15.ae-rus.net:2181 > rack.aware.assignment.strategy=balance_subtopology > rack.aware.assignment.traffic_cost=org.apache.kafka.streams.processor.internals.assignment.RackAwareTaskAssignor > compression.gzip.level=5 > zookeeper.metadata.migration.enable=True > zookeeper.metadata.migration.min.batch.size=512 > initial.broker.registration.timeout.ms=18000 > {code} > When trying to move to the next step (`Migrating brokers to KRaft`), it fails > to get controller quorum and crashes. > {code:java} > ==> /var/log/kafka/kafka-server.log <== > [2025-05-20 15:24:07,521] INFO Registered kafka:type=kafka.Log4jController > MBean (kafka.utils.Log4jControllerRegistration$) > [2025-05-20 15:24:07,826] INFO [BrokerServer id=2] Transition from SHUTDOWN > to STARTING (kafka.server.BrokerServer) > [2025-05-20 15:24:07,827] INFO [SharedServer id=2] Starting SharedServer > (kafka.server.SharedServer) > [2025-05-20 15:24:07,884] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Recovering unflushed segment 21124. 0/1 recovered for > __cluster_metadata-0. 
(kafka.log.LogLoader) > [2025-05-20 15:24:07,885] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Loading producer state till offset 21124 with message > format version 2 (kafka.log.UnifiedLog$) > [2025-05-20 15:24:07,886] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Reloading from producer snapshot and rebuilding > producer state from offset 21124 (kafka.log.UnifiedLog$) > [2025-05-20 15:24:07,893] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Producer state recovery took 0ms for snapshot load > and 7ms for segment recovery from offset 21124 (kafka.log.UnifiedLog$) > [2025-05-20 15:24:07,958] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Loading producer state till offset 31424 with message > format version 2 (kafka.log.UnifiedLog$) > [2025-05-20 15:24:07,958] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Reloading from producer snapshot and rebuilding > producer state from offset 31424 (kafka.log.UnifiedLog$) > [2025-05-20 15:24:07,960] INFO [LogLoader partition=__cluster_metadata-0, > dir=/data/kafka1/kafka] Producer state recovery took 2ms for snapshot load > and 0ms for segment recovery from offset 31424 (kafka.log.UnifiedLog$) > [2025-05-20 15:24:07,995] INFO Initialized snapshots with IDs > SortedSet(OffsetAndEpoch(offset=21124, epoch=7)) from > /data/kafka1/kafka/__cluster_metadata-0 (kafka.raft.KafkaMetadataLog$) > [2025-05-20 15:24:08,023] INFO [raft-expiration-reaper]: Starting > (kafka.raft.TimingWheelExpirationService$ExpiredOperationReaper) > [2025-05-20 15:24:08,301] INFO [BrokerServer id=2] Starting broker > (kafka.server.BrokerServer) > [2025-05-20 15:24:08,318] INFO [broker-2-ThrottledChannelReaper-Fetch]: > Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper) > [2025-05-20 15:24:08,319] INFO [broker-2-ThrottledChannelReaper-Produce]: > Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper) > [2025-05-20 15:24:08,320] INFO [broker-2-ThrottledChannelReaper-Request]: > Starting (kafka.server.ClientQuotaManager$ThrottledChannelReaper) > [2025-05-20 15:24:08,322] INFO > [broker-2-ThrottledChannelReaper-ControllerMutation]: Starting > (kafka.server.ClientQuotaManager$ThrottledChannelReaper) > [2025-05-20 15:24:08,365] INFO [BrokerServer id=2] Waiting for controller > quorum voters future (kafka.server.BrokerServer) > [2025-05-20 15:24:08,365] INFO [BrokerServer id=2] Finished waiting for > controller quorum voters future (kafka.server.BrokerServer) > [2025-05-20 15:24:08,371] INFO > [broker-2-to-controller-forwarding-channel-manager]: Starting > (kafka.server.NodeToControllerRequestThread) > [2025-05-20 15:24:08,592] INFO Updated connection-accept-rate max connection > creation rate to 2147483647 (kafka.network.ConnectionQuotas) > [2025-05-20 15:24:08,687] INFO [SocketServer listenerType=BROKER, nodeId=2] > Created data-plane acceptor and processors for endpoint : > ListenerName(EXTERNAL) (kafka.network.SocketServer) > [2025-05-20 15:24:08,688] INFO Updated connection-accept-rate max connection > creation rate to 2147483647 (kafka.network.ConnectionQuotas) > [2025-05-20 15:24:08,759] INFO [SocketServer listenerType=BROKER, nodeId=2] > Created data-plane acceptor and processors for endpoint : > ListenerName(INTERNAL) (kafka.network.SocketServer) > [2025-05-20 15:24:08,765] INFO > [broker-2-to-controller-alter-partition-channel-manager]: Starting > (kafka.server.NodeToControllerRequestThread) > [2025-05-20 
15:24:08,771] INFO > [broker-2-to-controller-directory-assignments-channel-manager]: Starting > (kafka.server.NodeToControllerRequestThread) > [2025-05-20 15:24:08,788] INFO [ExpirationReaper-2-Produce]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2025-05-20 15:24:08,788] INFO [ExpirationReaper-2-Fetch]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2025-05-20 15:24:08,789] INFO [ExpirationReaper-2-DeleteRecords]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2025-05-20 15:24:08,790] INFO [ExpirationReaper-2-ElectLeader]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2025-05-20 15:24:08,790] INFO [ExpirationReaper-2-RemoteFetch]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2025-05-20 15:24:08,808] INFO [ExpirationReaper-2-Heartbeat]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2025-05-20 15:24:08,808] INFO [ExpirationReaper-2-Rebalance]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2025-05-20 15:24:08,863] INFO > [broker-2-to-controller-heartbeat-channel-manager]: Starting > (kafka.server.NodeToControllerRequestThread) > [2025-05-20 15:24:08,865] INFO [BrokerLifecycleManager id=2] Incarnation > ZvarJwVqQZOUzzobd91HlQ of broker 2 in cluster 9ca1s2tbTGm1Bi35aynf6Q is now > STARTING. (kafka.server.BrokerLifecycleManager) > [2025-05-20 15:24:08,898] INFO [ExpirationReaper-2-AlterAcls]: Starting > (kafka.server.DelayedOperationPurgatory$ExpiredOperationReaper) > [2025-05-20 15:24:08,978] INFO [BrokerServer id=2] Waiting for the broker > metadata publishers to be installed (kafka.server.BrokerServer) > [2025-05-20 15:24:08,979] INFO [BrokerServer id=2] Finished waiting for the > broker metadata publishers to be installed (kafka.server.BrokerServer) > [2025-05-20 15:24:08,979] INFO [BrokerServer id=2] Waiting for the controller > to acknowledge that we are caught up (kafka.server.BrokerServer) > {code} > {code:java} > ==> /var/log/kafka/kafka-server.log <== > [2025-05-20 15:24:13,380] INFO [BrokerLifecycleManager id=2] Unable to > register the broker because the RPC got timed out before it could be sent. > (kafka.server.BrokerLifecycleManager) > [2025-05-20 15:24:17,990] INFO [BrokerLifecycleManager id=2] Unable to > register the broker because the RPC got timed out before it could be sent. > (kafka.server.BrokerLifecycleManager) > [2025-05-20 15:24:22,701] INFO [BrokerLifecycleManager id=2] Unable to > register the broker because the RPC got timed out before it could be sent. > (kafka.server.BrokerLifecycleManager) > [2025-05-20 15:24:26,864] ERROR [BrokerLifecycleManager id=2] Shutting down > because we were unable to register with the controller quorum. > (kafka.server.BrokerLifecycleManager) > [2025-05-20 15:24:26,864] INFO [BrokerLifecycleManager id=2] Transitioning > from STARTING to SHUTTING_DOWN. 
(kafka.server.BrokerLifecycleManager) > [2025-05-20 15:24:26,865] INFO > [broker-2-to-controller-heartbeat-channel-manager]: Shutting down > (kafka.server.NodeToControllerRequestThread) > [2025-05-20 15:24:26,865] INFO > [broker-2-to-controller-heartbeat-channel-manager]: Stopped > (kafka.server.NodeToControllerRequestThread) > [2025-05-20 15:24:26,865] INFO > [broker-2-to-controller-heartbeat-channel-manager]: Shutdown completed > (kafka.server.NodeToControllerRequestThread) > [2025-05-20 15:24:26,865] ERROR [BrokerServer id=2] Received a fatal error > while waiting for the controller to acknowledge that we are caught up > (kafka.server.BrokerServer) > java.util.concurrent.CancellationException > at > java.base/java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2510) > at > kafka.server.BrokerLifecycleManager$ShutdownEvent.run(BrokerLifecycleManager.scala:638) > at > org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:191) > at java.base/java.lang.Thread.run(Thread.java:1583) > [2025-05-20 15:24:26,867] INFO Node to controller channel manager for > heartbeat shutdown (kafka.server.NodeToControllerChannelManagerImpl) > [2025-05-20 15:24:26,868] INFO [BrokerServer id=2] Transition from STARTING > to STARTED (kafka.server.BrokerServer) > [2025-05-20 15:24:26,870] ERROR [BrokerServer id=2] Fatal error during broker > startup. Prepare to shutdown (kafka.server.BrokerServer) > java.lang.RuntimeException: Received a fatal error while waiting for the > controller to acknowledge that we are caught up > at > org.apache.kafka.server.util.FutureUtils.waitWithLogging(FutureUtils.java:73) > at kafka.server.BrokerServer.startup(BrokerServer.scala:502) > at > kafka.server.KafkaRaftServer.$anonfun$startup$2(KafkaRaftServer.scala:99) > at > kafka.server.KafkaRaftServer.$anonfun$startup$2$adapted(KafkaRaftServer.scala:99) > at scala.Option.foreach(Option.scala:437) > at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:99) > at kafka.Kafka$.main(Kafka.scala:112) > at kafka.Kafka.main(Kafka.scala) > Caused by: java.util.concurrent.CancellationException > at > java.base/java.util.concurrent.CompletableFuture.cancel(CompletableFuture.java:2510) > at > kafka.server.BrokerLifecycleManager$ShutdownEvent.run(BrokerLifecycleManager.scala:638) > at > org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:191) > at java.base/java.lang.Thread.run(Thread.java:1583) > [2025-05-20 15:24:26,870] INFO [BrokerServer id=2] Transition from STARTED to > SHUTTING_DOWN (kafka.server.BrokerServer) > [2025-05-20 15:24:26,871] INFO [BrokerServer id=2] shutting down > (kafka.server.BrokerServer) > [2025-05-20 15:24:26,871] INFO [SocketServer listenerType=BROKER, nodeId=2] > Stopping socket server request processors (kafka.network.SocketServer) > [2025-05-20 15:24:26,874] ERROR Cannot invoke > "java.nio.channels.ServerSocketChannel.close()" because the return value of > "kafka.network.Acceptor.serverChannel()" is null > (kafka.network.DataPlaneAcceptor) > java.lang.NullPointerException: Cannot invoke > "java.nio.channels.ServerSocketChannel.close()" because the return value of > "kafka.network.Acceptor.serverChannel()" is null > at kafka.network.Acceptor.$anonfun$closeAll$2(SocketServer.scala:711) > at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:68) > at kafka.network.Acceptor.closeAll(SocketServer.scala:711) > at kafka.network.Acceptor.close(SocketServer.scala:678) > at > kafka.network.SocketServer.$anonfun$stopProcessingRequests$4(SocketServer.scala:287) > 
at > kafka.network.SocketServer.$anonfun$stopProcessingRequests$4$adapted(SocketServer.scala:287) > at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619) > at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617) > at scala.collection.AbstractIterable.foreach(Iterable.scala:935) > at > kafka.network.SocketServer.stopProcessingRequests(SocketServer.scala:287) > at kafka.server.BrokerServer.$anonfun$shutdown$5(BrokerServer.scala:638) > at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:68) > at kafka.server.BrokerServer.shutdown(BrokerServer.scala:638) > at kafka.server.KafkaBroker.shutdown(KafkaBroker.scala:98) > at kafka.server.KafkaBroker.shutdown$(KafkaBroker.scala:98) > at kafka.server.BrokerServer.shutdown(BrokerServer.scala:67) > at kafka.server.BrokerServer.startup(BrokerServer.scala:555) > at > kafka.server.KafkaRaftServer.$anonfun$startup$2(KafkaRaftServer.scala:99) > at > kafka.server.KafkaRaftServer.$anonfun$startup$2$adapted(KafkaRaftServer.scala:99) > at scala.Option.foreach(Option.scala:437) > at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:99) > at kafka.Kafka$.main(Kafka.scala:112) > at kafka.Kafka.main(Kafka.scala) > [2025-05-20 15:24:26,906] ERROR Cannot invoke > "java.nio.channels.ServerSocketChannel.close()" because the return value of > "kafka.network.Acceptor.serverChannel()" is null > (kafka.network.DataPlaneAcceptor) > java.lang.NullPointerException: Cannot invoke > "java.nio.channels.ServerSocketChannel.close()" because the return value of > "kafka.network.Acceptor.serverChannel()" is null > at kafka.network.Acceptor.$anonfun$closeAll$2(SocketServer.scala:711) > at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:68) > at kafka.network.Acceptor.closeAll(SocketServer.scala:711) > at kafka.network.Acceptor.close(SocketServer.scala:678) > at > kafka.network.SocketServer.$anonfun$stopProcessingRequests$4(SocketServer.scala:287) > at > kafka.network.SocketServer.$anonfun$stopProcessingRequests$4$adapted(SocketServer.scala:287) > at scala.collection.IterableOnceOps.foreach(IterableOnce.scala:619) > at scala.collection.IterableOnceOps.foreach$(IterableOnce.scala:617) > at scala.collection.AbstractIterable.foreach(Iterable.scala:935) > at > kafka.network.SocketServer.stopProcessingRequests(SocketServer.scala:287) > at kafka.server.BrokerServer.$anonfun$shutdown$5(BrokerServer.scala:638) > at kafka.utils.CoreUtils$.swallow(CoreUtils.scala:68) > at kafka.server.BrokerServer.shutdown(BrokerServer.scala:638) > at kafka.server.KafkaBroker.shutdown(KafkaBroker.scala:98) > at kafka.server.KafkaBroker.shutdown$(KafkaBroker.scala:98) > at kafka.server.BrokerServer.shutdown(BrokerServer.scala:67) > at kafka.server.BrokerServer.startup(BrokerServer.scala:555) > at > kafka.server.KafkaRaftServer.$anonfun$startup$2(KafkaRaftServer.scala:99) > at > kafka.server.KafkaRaftServer.$anonfun$startup$2$adapted(KafkaRaftServer.scala:99) > at scala.Option.foreach(Option.scala:437) > at kafka.server.KafkaRaftServer.startup(KafkaRaftServer.scala:99) > at kafka.Kafka$.main(Kafka.scala:112) > at kafka.Kafka.main(Kafka.scala) > [2025-05-20 15:24:26,932] INFO [SocketServer listenerType=BROKER, nodeId=2] > Stopped socket server request processors (kafka.network.SocketServer) {code} > quorum is ok and accessible from brokers > {code:java} > root@prodkafka2z1:/data# /opt/kafka/bin/kafka-metadata-quorum.sh > --bootstrap-controller prodkafkacontr11z1.h.ae-rus.net:9091 describe > --replication > NodeId DirectoryId LogEndOffset Lag LastFetchTimestamp > 
LastCaughtUpTimestamp  Status
> 111  L_5_4pNVTlyBl6_go_TkMA  5457  0  1747754839735  1747754839735  Leader
> 117  uXDetGxGYrzMssuEw-TsKQ  5457  0  1747754839705  1747754839705  Observer
> 118  EAqK0bysSLhV57oMWfeRbw  5457  0  1747754839705  1747754839705  Observer
> 112  utKjIX3i9UJU-dZf5e5rfQ  5457  0  1747754839700  1747754839700  Observer
> 113  gqu36Z80ogEmZ0cflxvAAw  5457  0  1747754839699  1747754839699  Observer
> 119  UcFbHUd5RGqXoqmsNXYYsA  5457  0  1747754839705  1747754839705  Observer
> 115  jH9oLOSeGoth0R7w6CPT3A  5457  0  1747754839701  1747754839701  Observer
> 114  nMGKPes0EA8gB4YS1E7KJw  5457  0  1747754839701  1747754839701  Observer
> 116  eoiyLM_huf88Ny5ZexL_7w  5457  0  1747754839701  1747754839701  Observer
> root@prodkafka2z1:/data# /opt/kafka/bin/kafka-metadata-quorum.sh --bootstrap-controller prodkafkacontr11z1.h.ae-rus.net:9091 describe --status
> ClusterId: 9ca1s2tbTGm1Bi35aynf6Q
> LeaderId: 111
> LeaderEpoch: 2
> HighWatermark: 5467
> MaxFollowerLag: 0
> MaxFollowerLagTimeMs: 0
> CurrentVoters: [{"id": 111, "directoryId": "L_5_4pNVTlyBl6_go_TkMA", "endpoints": ["CONTROLLER://prodkafkacontr11z1.h.ae-rus.net:9091"]}]
> CurrentObservers: [{"id": 117, "directoryId": "uXDetGxGYrzMssuEw-TsKQ"}, {"id": 118, "directoryId": "EAqK0bysSLhV57oMWfeRbw"}, {"id": 112, "directoryId": "utKjIX3i9UJU-dZf5e5rfQ"}, {"id": 113, "directoryId": "gqu36Z80ogEmZ0cflxvAAw"}, {"id": 119, "directoryId": "UcFbHUd5RGqXoqmsNXYYsA"}, {"id": 115, "directoryId": "jH9oLOSeGoth0R7w6CPT3A"}, {"id": 114, "directoryId": "nMGKPes0EA8gB4YS1E7KJw"}, {"id": 116, "directoryId": "eoiyLM_huf88Ny5ZexL_7w"}]
> {code}
-- This message was sent by Atlassian Jira (v8.20.10#820010)
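For reference, the same quorum check can also be run from Java on a broker host by pointing the Admin client straight at the controller listener, which is roughly the path the broker-to-controller channel uses. This is only a minimal sketch, not something from the report: it assumes a kafka-clients version with `bootstrap.controllers` support from KIP-919 (3.7 or newer, so 3.9.0 qualifies), and the class name and the 15 s timeout are made up for illustration.
{code:java}
// Hypothetical probe, not part of the issue: queries the metadata quorum the same way
// kafka-metadata-quorum.sh does, but via the Java Admin client, so it can be run from a
// broker host to confirm the CONTROLLER listener is reachable from there.
import java.util.Properties;

import org.apache.kafka.clients.admin.Admin;
import org.apache.kafka.clients.admin.QuorumInfo;

public class ControllerQuorumProbe {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        // KIP-919 (Kafka 3.7+): connect the Admin client to the controller listener directly,
        // instead of going through a broker with bootstrap.servers.
        props.put("bootstrap.controllers", "prodkafkacontr11z1.h.ae-rus.net:9091");
        // Keep the timeout short so a hang shows up quickly (value chosen for illustration).
        props.put("request.timeout.ms", "15000");

        try (Admin admin = Admin.create(props)) {
            QuorumInfo quorum = admin.describeMetadataQuorum().quorumInfo().get();
            System.out.println("leader=" + quorum.leaderId()
                    + " highWatermark=" + quorum.highWatermark());
            quorum.voters().forEach(v ->
                    System.out.println("voter " + v.replicaId() + " logEndOffset=" + v.logEndOffset()));
            quorum.observers().forEach(o ->
                    System.out.println("observer " + o.replicaId() + " logEndOffset=" + o.logEndOffset()));
        }
    }
}
{code}
Note that a successful call here only shows the controller listener is reachable from the broker host; it does not exercise the broker registration RPC that is timing out in the logs above.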