zzshine created KAFKA-19643:
-------------------------------

             Summary: Controller keeps switching and occasionally goes offline.
                 Key: KAFKA-19643
                 URL: https://issues.apache.org/jira/browse/KAFKA-19643
             Project: Kafka
          Issue Type: Bug
          Components: controller, kraft
    Affects Versions: 3.9.1
         Environment: CentOS Linux 7, kernel release 4.19.325
Java 21
            Reporter: zzshine


Network communication between the cluster nodes is normal, with no packet
loss, and the cluster is properly configured.
Nevertheless, the Kafka server continuously prints the following logs:
[2025-08-25 19:08:55,581] INFO [RaftManager id=1] Become candidate due to fetch 
timeout (org.apache.kafka.raft.KafkaRaftClient)
[2025-08-25 19:08:55,686] INFO [RaftManager id=1] Disconnecting from node 2 due 
to request timeout. (org.apache.kafka.clients.NetworkClient)
[2025-08-25 19:08:55,686] INFO [RaftManager id=1] Cancelled in-flight FETCH 
request with correlation id 128927 due to node 2 being disconnected (elapsed 
time since creation: 5147ms, elapsed time since send: 5146ms, throttle time: 
0ms, request timeout: 5000ms) (org.apache.kafka.clients.NetworkClient)
[2025-08-25 19:09:33,274] INFO [NodeToControllerChannelManager id=1 
name=heartbeat] Disconnecting from node 3 due to request timeout. 
(org.apache.kafka.clients.NetworkClient)
[2025-08-25 19:09:33,274] INFO [NodeToControllerChannelManager id=1 
name=heartbeat] Cancelled in-flight BROKER_HEARTBEAT request with correlation 
id 871 due to node 3 being disconnected (elapsed time since creation: 4004ms, 
elapsed time since send: 4004ms, throttle time: 0ms, request timeout: 4000ms) 
(org.apache.kafka.clients.NetworkClient)
[2025-08-25 19:09:33,807] INFO [RaftManager id=1] Disconnecting from node 3 due 
to request timeout. (org.apache.kafka.clients.NetworkClient)
[2025-08-25 19:09:33,807] INFO [RaftManager id=1] Cancelled in-flight FETCH 
request with correlation id 128995 due to node 3 being disconnected (elapsed 
time since creation: 5720ms, elapsed time since send: 5720ms, throttle time: 
0ms, request timeout: 5000ms) (org.apache.kafka.clients.NetworkClient)
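
To make the timings in these entries easier to compare, here is a minimal
sketch that extracts the elapsed time and request timeout from each
"Cancelled in-flight" entry. It is only an illustration: the helper name
summarize_timeouts is hypothetical, and the regular expression assumes the
exact NetworkClient message layout shown in the excerpts above.

```python
import re

# "Cancelled in-flight" entries copied from this report, email wrapping included.
LOG = """
[2025-08-25 19:08:55,686] INFO [RaftManager id=1] Cancelled in-flight FETCH
request with correlation id 128927 due to node 2 being disconnected (elapsed
time since creation: 5147ms, elapsed time since send: 5146ms, throttle time:
0ms, request timeout: 5000ms) (org.apache.kafka.clients.NetworkClient)
[2025-08-25 19:09:33,274] INFO [NodeToControllerChannelManager id=1
name=heartbeat] Cancelled in-flight BROKER_HEARTBEAT request with correlation
id 871 due to node 3 being disconnected (elapsed time since creation: 4004ms,
elapsed time since send: 4004ms, throttle time: 0ms, request timeout: 4000ms)
(org.apache.kafka.clients.NetworkClient)
[2025-08-25 19:09:33,807] INFO [RaftManager id=1] Cancelled in-flight FETCH
request with correlation id 128995 due to node 3 being disconnected (elapsed
time since creation: 5720ms, elapsed time since send: 5720ms, throttle time:
0ms, request timeout: 5000ms) (org.apache.kafka.clients.NetworkClient)
"""

# Assumed message layout, taken from the excerpts above.
PATTERN = re.compile(
    r"Cancelled in-flight (?P<api>\w+) request with correlation id \d+ "
    r"due to node (?P<node>\d+) being disconnected "
    r"\(elapsed time since creation: \d+ms, "
    r"elapsed time since send: (?P<sent>\d+)ms, "
    r"throttle time: \d+ms, request timeout: (?P<timeout>\d+)ms\)"
)


def summarize_timeouts(log_text):
    """Return (api, node, elapsed_ms, timeout_ms, overshoot_ms) per entry."""
    # Collapse the email line wrapping so each entry is on one logical line.
    flat = " ".join(log_text.split())
    rows = []
    for m in PATTERN.finditer(flat):
        sent = int(m["sent"])
        timeout = int(m["timeout"])
        rows.append((m["api"], int(m["node"]), sent, timeout, sent - timeout))
    return rows


if __name__ == "__main__":
    for api, node, sent, timeout, over in summarize_timeouts(LOG):
        print(f"{api} to node {node}: {sent}ms elapsed, "
              f"timeout {timeout}ms, overshoot {over}ms")
```

For the three cancelled requests in this report, the overshoot past the
request timeout is 146 ms, 4 ms, and 720 ms respectively.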

The Kafka KRaft config is:
# default 2000
broker.heartbeat.interval.ms=4000
# default 9000
broker.session.timeout.ms=10000
# default 2000
controller.quorum.request.timeout.ms=5000
# default 1000
controller.quorum.election.timeout.ms=5000
# default 1000
controller.quorum.election.backoff.max.ms=3000
# default 2000
controller.quorum.fetch.timeout.ms=6000
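
To see how far each setting is from the default noted in its comment, the
fragment above can be parsed with a short script. This is only an
illustrative sketch: the helper name parse_overrides is hypothetical, and it
assumes every property is preceded by a "# default N" comment as in this
report.

```python
# Config fragment copied from this report.
CONFIG = """
# default 2000
broker.heartbeat.interval.ms=4000
# default 9000
broker.session.timeout.ms=10000
# default 2000
controller.quorum.request.timeout.ms=5000
# default 1000
controller.quorum.election.timeout.ms=5000
# default 1000
controller.quorum.election.backoff.max.ms=3000
# default 2000
controller.quorum.fetch.timeout.ms=6000
"""


def parse_overrides(text):
    """Map each property name to a (configured_ms, default_ms) pair."""
    overrides = {}
    default = None
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("# default"):
            # Remember the default stated in the comment for the next property.
            default = int(line.split()[-1])
        elif "=" in line:
            key, value = line.split("=", 1)
            overrides[key.strip()] = (int(value), default)
            default = None
    return overrides


if __name__ == "__main__":
    for key, (configured, default) in parse_overrides(CONFIG).items():
        print(f"{key}: {configured}ms (default {default}ms, "
              f"x{configured / default:.2f})")
```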



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
