Paolo Patierno created KAFKA-19867:
--------------------------------------
Summary: Broker only node sending UpdateVoteRequest when it can't
really become a voter
Key: KAFKA-19867
URL: https://issues.apache.org/jira/browse/KAFKA-19867
Project: Kafka
Issue Type: Bug
Affects Versions: 4.2.0
Reporter: Paolo Patierno
I am implementing support for controller scaling in the Strimzi project
(running Apache Kafka on Kubernetes). Because I want to leverage the auto-join
feature, I am using the Apache Kafka code from the current "trunk" branch,
which will become the 4.2.0 release.
When scaling down controllers, the auto-join documentation says that you
should first shut down the controller and only then run remove-controller (via
the kafka-metadata-quorum tool, or programmatically, in the case of a
Kubernetes operator, by calling RemoveRaftVoter through the Admin Client API).
Otherwise the node enters a loop: you remove it, and it automatically rejoins.
When managing a Kafka cluster running on bare metal/VMs, this approach works
fine, even when the controller scale-down consists of removing the controller
role from a mixed node (shut down the node, run the kafka-metadata-quorum tool
to remove-controller, restart the node as broker only). But in a cloud-native
environment like Kubernetes, pod rolling is driven by the platform, so there is
no way to make a RemoveRaftVoter admin call between the shutdown and the
restart. For this reason, the remove-controller is done once the node has
restarted as broker only.
The issue I am facing is that when such a node restarts as broker only while it
is still in the quorum voter set (because the remove-controller hasn't happened
yet), I get the following exception:
{code:java}
2025-10-31 08:01:21 TRACE [kafka-1-raft-io-thread] KafkaRaftClient:2899 -
[RaftManager id=1] Sent outbound request: OutboundRequest(correlationId=13,
data=UpdateRaftVoterRequestData(clusterId='zsn8QaOzTICYZBhUYQpJBg',
currentLeaderEpoch=2, voterId=1, voterDirectoryId=ceZ1jCL9DirrUuCxwsv-jw,
listeners=[], kRaftVersionFeature=KRaftVersionFeature(minSupportedVersion=0,
maxSupportedVersion=1)), createdTimeMs=1761897681990,
destination=my-cluster-broker-0.my-cluster-kafka-brokers.myproject.svc:9090
(id: 0 rack: null isFenced: false))
2025-10-31 08:01:21 TRACE [kafka-1-raft-io-thread] KafkaRaftClient:2830 -
[RaftManager id=1] Received inbound message InboundResponse(correlationId=13,
data=UpdateRaftVoterResponseData(throttleTimeMs=0, errorCode=42,
currentLeader=CurrentLeader(leaderId=0, leaderEpoch=2,
host='my-cluster-broker-0.my-cluster-kafka-brokers.myproject.svc', port=9090)),
source=my-cluster-broker-0.my-cluster-kafka-brokers.myproject.svc:9090 (id: 0
rack: null isFenced: false))
2025-10-31 08:01:21 ERROR [kafka-1-raft-io-thread]
ProcessTerminatingFaultHandler:46 - Encountered fatal fault: Unexpected error
in raft IO thread
java.lang.IllegalStateException: Received unexpected invalid request error
    at org.apache.kafka.raft.KafkaRaftClient.maybeHandleCommonResponse(KafkaRaftClient.java:2679)
    at org.apache.kafka.raft.KafkaRaftClient.handleUpdateVoterResponse(KafkaRaftClient.java:2569)
    at org.apache.kafka.raft.KafkaRaftClient.handleResponse(KafkaRaftClient.java:2737)
    at org.apache.kafka.raft.KafkaRaftClient.handleInboundMessage(KafkaRaftClient.java:2836)
    at org.apache.kafka.raft.KafkaRaftClient.poll(KafkaRaftClient.java:3680)
    at org.apache.kafka.raft.KafkaRaftClientDriver.doWork(KafkaRaftClientDriver.java:64)
    at org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:136)
{code}
It happens because the node (which is now broker only) sends an
UpdateRaftVoter request, since it still sees itself in the voters list, even
though it is no longer a controller. It then cannot handle the response, which
is unexpected for a broker-only node.
I think that, even though the remove-controller has not been done yet, the
broker-only node should not send such a request, because it cannot handle the
response in any case and ends up in a "broken" code path.
The code where this happens is
{{KafkaRaftClient.shouldSendUpdateVoteRequest}}, which does not check the
{{canBecomeVoter}} flag before sending the request (here
https://github.com/apache/kafka/blob/trunk/raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java#L3299).
By contrast, the {{shouldSendAddOrRemoveVoterRequest}} method does perform this
check (here
https://github.com/apache/kafka/blob/trunk/raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java#L3355).
I think that adding the check would fix the issue: the node is no longer a
controller and cannot become a voter, so the flag is false and would prevent
the node from sending the UpdateRaftVoter request.
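To illustrate the idea, here is a minimal, self-contained sketch of the guard.
It is not the actual KafkaRaftClient code: the class and method names below are
hypothetical, and the other conditions the real method may evaluate are
deliberately left out. It only models the two inputs discussed above, the
canBecomeVoter flag and membership in the voter set.

```java
import java.util.Set;

// Hypothetical, simplified model of the decision; NOT the real KafkaRaftClient.
final class UpdateVoterGuardSketch {

    // Behavior as described in this report: only voter-set membership is
    // consulted, so a broker-only node still listed as a voter sends the request.
    static boolean shouldSendWithoutGuard(int nodeId, boolean canBecomeVoter,
                                          Set<Integer> voterIds) {
        return voterIds.contains(nodeId);
    }

    // Proposed behavior: additionally require that the node can be a voter.
    static boolean shouldSendWithGuard(int nodeId, boolean canBecomeVoter,
                                       Set<Integer> voterIds) {
        return canBecomeVoter && voterIds.contains(nodeId);
    }

    public static void main(String[] args) {
        // Node 1 restarted as broker only but has not been removed yet.
        Set<Integer> voterIds = Set.of(0, 1, 2);

        System.out.println(shouldSendWithoutGuard(1, false, voterIds)); // true
        System.out.println(shouldSendWithGuard(1, false, voterIds));    // false
        // A real controller (node 2) is unaffected by the extra check.
        System.out.println(shouldSendWithGuard(2, true, voterIds));     // true
    }
}
```

With the guard in place, the broker-only node would simply skip the request
until the remove-controller is executed, while controllers behave as before.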
If accepted, I would be willing to open a PR to fix this.