[
https://issues.apache.org/jira/browse/KAFKA-19867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18036062#comment-18036062
]
Luke Chen edited comment on KAFKA-19867 at 11/7/25 3:40 AM:
------------------------------------------------------------
> My question is ... is it really an observer? If in this timeframe you request
> the quorum status, you will see it as voter and not in the observers list
> until making the remove-controller. Is it acceptable?
You are totally right. This is definitely not a correct state for sure. I think
the right way to do is to fail start the node because it is not in the correct
state.
But as I said earlier, if this is because of the not cloud native friendly
design for the auto.join voter feature, adding this additional check to
workaround the combined-node issue might be a possible temporary option.
However, if there is a user really having a bad setting to accidentally change
the combined-role node into a broker-role only node, there might be no
exception thrown and the quorum is basically running with one voter missing
state.
So, I don't really know what is the right way to go for v4.2.0. Choose the
"right way" to throw the exception as is, or adopt the temporary solution to
workaround the auto.join issue? Maybe the former is better? I'd like to hear
other thoughts on this.
was (Author: showuon):
> My question is ... is it really an observer? If in this timeframe you request
> the quorum status, you will see it as voter and not in the observers list
> until making the remove-controller. Is it acceptable?
You are totally right. This is definitely not a correct state for sure. I think
the right way to do is to fail start the node because it is not in the correct
state.
But as I said earlier, if this is because of the not cloud native friendly
design for the auto.join voter feature, adding this additional check to
workaround the combined-node issue might be a possible temporary option.
However, if there is a user really having a bad setting to accidentally change
the combined-role node into a broker-role only node, there might be no
exception thrown and the quorum is basically running with one voter missing
state.
So, I don't really know what is the right way to go for v4.2.0. Choose the
"right way" to throw the exception as is, or adopt the temporary solution to
workaround the auto.join issue? I'd like to hear other thoughts on this.
> Broker only node sending UpdateVoteRequest when it can't really become a voter
> ------------------------------------------------------------------------------
>
> Key: KAFKA-19867
> URL: https://issues.apache.org/jira/browse/KAFKA-19867
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 4.2.0
> Reporter: Paolo Patierno
> Priority: Major
>
> I am in the process of implementing support for controllers scaling within
> the Strimzi project (running Apache Kafka on Kubernetes) by also using the
> Apache Kafka code in the current "trunk" branch so the future 4.2.0 release
> because I want to leverage the auto-join feature.
> When scaling down controllers, the auto-join related documentation mentions
> that you should first shutdown the controller and later running the
> remove-controller (via the kafka-metadata-quorum tool, or programmatically in
> a Kubernetes operator case by using the RemoveRaftVoter via the Admin Client
> API), otherwise it's pretty clear the node enters in a loop where you remove
> it but it rejoins automatically again.
> When managing a Kafka cluster running on bare metal/VMs, this approach works
> fine even in case the controller scale-down is happening by removing the
> controller role from a mixed node (shutdown the node, run
> kafka-metadata-quorum tool to remove-controller, restart the node as broker
> only). But in a cloud-native environment like Kubernetes, the pod rolling is
> driven by the platform so there is no way to run a RemoveRaftVoter admin call
> in between the shutdown and restart. For this reason, the remove-controller
> is done when the node restarts as broker only.
> The issue I am facing is that when such a node restarts as broker only, but
> it's still in the quorum voter (because the remove-controller isn't happened
> yet), I get the following exception:
>
> {code:java}
> 2025-10-31 08:01:21 TRACE [kafka-1-raft-io-thread] KafkaRaftClient:2899 -
> [RaftManager id=1] Sent outbound request: OutboundRequest(correlationId=13,
> data=UpdateRaftVoterRequestData(clusterId='zsn8QaOzTICYZBhUYQpJBg',
> currentLeaderEpoch=2, voterId=1, voterDirectoryId=ceZ1jCL9DirrUuCxwsv-jw,
> listeners=[], kRaftVersionFeature=KRaftVersionFeature(minSupportedVersion=0,
> maxSupportedVersion=1)), createdTimeMs=1761897681990,
> destination=my-cluster-broker-0.my-cluster-kafka-brokers.myproject.svc:9090
> (id: 0 rack: null isFenced: false))2025-10-31 08:01:21 TRACE
> [kafka-1-raft-io-thread] KafkaRaftClient:2830 - [RaftManager id=1] Received
> inbound message InboundResponse(correlationId=13,
> data=UpdateRaftVoterResponseData(throttleTimeMs=0, errorCode=42,
> currentLeader=CurrentLeader(leaderId=0, leaderEpoch=2,
> host='my-cluster-broker-0.my-cluster-kafka-brokers.myproject.svc',
> port=9090)),
> source=my-cluster-broker-0.my-cluster-kafka-brokers.myproject.svc:9090 (id: 0
> rack: null isFenced: false))2025-10-31 08:01:21 ERROR
> [kafka-1-raft-io-thread] ProcessTerminatingFaultHandler:46 - Encountered
> fatal fault: Unexpected error in raft IO
> threadjava.lang.IllegalStateException: Received unexpected invalid request
> error at
> org.apache.kafka.raft.KafkaRaftClient.maybeHandleCommonResponse(KafkaRaftClient.java:2679)
> at
> org.apache.kafka.raft.KafkaRaftClient.handleUpdateVoterResponse(KafkaRaftClient.java:2569)
> at
> org.apache.kafka.raft.KafkaRaftClient.handleResponse(KafkaRaftClient.java:2737)
> at
> org.apache.kafka.raft.KafkaRaftClient.handleInboundMessage(KafkaRaftClient.java:2836)
> at
> org.apache.kafka.raft.KafkaRaftClient.poll(KafkaRaftClient.java:3680)
> at
> org.apache.kafka.raft.KafkaRaftClientDriver.doWork(KafkaRaftClientDriver.java:64)
> at
> org.apache.kafka.server.util.ShutdownableThread.run(ShutdownableThread.java:136)
> {code}
> It's happening because the node (which is now broker only) is sending a
> UpdateRaftVoter request (because it sees itself still in the voters list)
> even if it's not actually a controller and, of course, it's not able to
> handle the response which is unexpected because it's a broker only node.
> I think, despite the remove-controller was not done yet, the broker-only node
> should not send such a request even because in any case it's not able to
> handle the response so it ends in a "broken" code path.
> The code where it's happening is within the
> {{KafkaRaftClient.shouldSendUpdateVoteRequest}} where it's not checking the
> {{canBecomeVoter}} flag before sending the request (here
> https://github.com/apache/kafka/blob/trunk/raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java#L3299).
> Such a check is available in the {{shouldSendAddOrRemoveVoterRequest}}
> method instead (here
> https://github.com/apache/kafka/blob/trunk/raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java#L3355).
> I think that adding the check would fix the issue because actually the node
> is not a controller anymore and it can't really become a voter and that flag
> is, of course, false avoiding the node to send the UpdateRaftVoter request.
> If accepted, I would be willing to open a PR to fix this.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)