zzshine created KAFKA-19643:
-------------------------------

             Summary: Controller keeps switching and occasionally goes offline.
                 Key: KAFKA-19643
                 URL: https://issues.apache.org/jira/browse/KAFKA-19643
             Project: Kafka
          Issue Type: Bug
          Components: controller, kraft
    Affects Versions: 3.9.1
         Environment: CentOS Linux 7, kernel-release: 4.19.325, Java 21
            Reporter: zzshine

Network communication between the cluster nodes is normal, with no packet loss, and the cluster is properly configured.

The Kafka server continuously prints the following logs:

[2025-08-25 19:08:55,581] INFO [RaftManager id=1] Become candidate due to fetch timeout (org.apache.kafka.raft.KafkaRaftClient)
[2025-08-25 19:08:55,686] INFO [RaftManager id=1] Disconnecting from node 2 due to request timeout. (org.apache.kafka.clients.NetworkClient)
[2025-08-25 19:08:55,686] INFO [RaftManager id=1] Cancelled in-flight FETCH request with correlation id 128927 due to node 2 being disconnected (elapsed time since creation: 5147ms, elapsed time since send: 5146ms, throttle time: 0ms, request timeout: 5000ms) (org.apache.kafka.clients.NetworkClient)
[2025-08-25 19:09:33,274] INFO [NodeToControllerChannelManager id=1 name=heartbeat] Disconnecting from node 3 due to request timeout. (org.apache.kafka.clients.NetworkClient)
[2025-08-25 19:09:33,274] INFO [NodeToControllerChannelManager id=1 name=heartbeat] Cancelled in-flight BROKER_HEARTBEAT request with correlation id 871 due to node 3 being disconnected (elapsed time since creation: 4004ms, elapsed time since send: 4004ms, throttle time: 0ms, request timeout: 4000ms) (org.apache.kafka.clients.NetworkClient)
[2025-08-25 19:09:33,807] INFO [RaftManager id=1] Disconnecting from node 3 due to request timeout. (org.apache.kafka.clients.NetworkClient)
[2025-08-25 19:09:33,807] INFO [RaftManager id=1] Cancelled in-flight FETCH request with correlation id 128995 due to node 3 being disconnected (elapsed time since creation: 5720ms, elapsed time since send: 5720ms, throttle time: 0ms, request timeout: 5000ms) (org.apache.kafka.clients.NetworkClient)

The Kafka KRaft configuration is:

# default 2000
broker.heartbeat.interval.ms=4000
# default 9000
broker.session.timeout.ms=10000
# default 2000
controller.quorum.request.timeout.ms=5000
# default 1000
controller.quorum.election.timeout.ms=5000
# default 1000
controller.quorum.election.backoff.max.ms=3000
# default 2000
controller.quorum.fetch.timeout.ms=6000



--
This message was sent by Atlassian Jira
(v8.20.10#820010)