[ https://issues.apache.org/jira/browse/KAFKA-16101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Luke Chen resolved KAFKA-16101.
-------------------------------
Resolution: Fixed
> KRaft migration rollback documentation is incorrect
> ---------------------------------------------------
>
> Key: KAFKA-16101
> URL: https://issues.apache.org/jira/browse/KAFKA-16101
> Project: Kafka
> Issue Type: Bug
> Components: kraft
> Affects Versions: 3.6.1
> Reporter: Paolo Patierno
> Assignee: Colin McCabe
> Priority: Blocker
> Fix For: 3.7.0
>
>
> Hello,
> I was trying the KRaft migration rollback procedure locally and came
> across a potential bug, or at least a situation where the cluster is not
> usable/available for a certain amount of time.
> In order to test the procedure, I start with a cluster of one broker
> (broker ID = 0) and one ZooKeeper node. Then I start the migration with
> one KRaft controller node (broker ID = 1). The migration runs fine and
> reaches the "dual write" state.
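> For reference, this is roughly the broker-side configuration that puts
> the cluster into migration mode (a sketch based on the KIP-866 docs; the
> node ID, host, and port values here are illustrative, not from my test):
> {code:bash}
> # Hypothetical additions to the broker's server.properties to start the
> # migration; values are illustrative.
> cat >> config/server.properties <<'EOF'
> # Enable the ZooKeeper-to-KRaft migration
> zookeeper.metadata.migration.enable=true
> # Point the broker at the KRaft controller quorum (node 1)
> controller.quorum.voters=1@localhost:9093
> controller.listener.names=CONTROLLER
> EOF
> {code}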
> From this point, I try to run the rollback procedure as described in the
> documentation.
> As a first step, this involves the following (a rough shell sketch is
> given after the list):
> * stopping the broker
> * removing the __cluster_metadata folder
> * removing the ZooKeeper migration flag and the controller-related
> configuration from the broker
> * restarting the broker
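> A rough shell sketch of those steps (the scripts assume a local test
> setup; the log directory path is illustrative):
> {code:bash}
> # 1. Stop the broker.
> ./bin/kafka-server-stop.sh
>
> # 2. Remove the local KRaft metadata log from the broker's log directory
> #    (the directory name assumes the default metadata partition layout).
> rm -rf /tmp/kafka-logs/__cluster_metadata-0
>
> # 3. Remove zookeeper.metadata.migration.enable, controller.quorum.voters
> #    and controller.listener.names from config/server.properties.
>
> # 4. Restart the broker in ZooKeeper mode.
> ./bin/kafka-server-start.sh -daemon config/server.properties
> {code}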
> With the above steps done, the broker starts in ZooKeeper mode (no
> migration, no knowledge of the KRaft controllers) and keeps logging the
> following messages at DEBUG level:
> {code:java}
> [2024-01-08 11:51:20,608] DEBUG [zk-broker-0-to-controller-forwarding-channel-manager]: Controller isn't cached, looking for local metadata changes (kafka.server.BrokerToControllerRequestThread)
> [2024-01-08 11:51:20,608] DEBUG [zk-broker-0-to-controller-forwarding-channel-manager]: No controller provided, retrying after backoff (kafka.server.BrokerToControllerRequestThread)
> [2024-01-08 11:51:20,629] DEBUG [zk-broker-0-to-controller-alter-partition-channel-manager]: Controller isn't cached, looking for local metadata changes (kafka.server.BrokerToControllerRequestThread)
> [2024-01-08 11:51:20,629] DEBUG [zk-broker-0-to-controller-alter-partition-channel-manager]: No controller provided, retrying after backoff (kafka.server.BrokerToControllerRequestThread)
> {code}
> What's happening should be clear: the /controller znode in ZooKeeper
> still reports the KRaft controller (broker ID = 1) as the active
> controller. The broker reads it from the znode but has no way to reach
> it.
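> This is easy to verify with the ZooKeeper shell (the exact JSON payload
> of the znode may vary by version):
> {code:bash}
> # Inspect the /controller znode: after the migration it still reports
> # the KRaft controller (broker ID = 1) as the active controller.
> ./bin/zookeeper-shell.sh localhost:2181 get /controller
> {code}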
> The issue is that until the procedure is fully completed with the next
> steps (shutting down the KRaft controller, deleting the /controller
> znode), the cluster is unusable. Any admin or client operation against
> the broker doesn't work; it just hangs, and the broker doesn't reply.
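> For example, per the behaviour described above, even a simple topic
> creation hangs at this point, since CreateTopics requests are forwarded
> to the controller and the broker has none to forward them to:
> {code:bash}
> # Hangs (until the client request timeout) while the broker keeps
> # retrying controller discovery.
> ./bin/kafka-topics.sh --bootstrap-server localhost:9092 \
>   --create --topic test --partitions 1 --replication-factor 1
> {code}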
> Extending this scenario to a more complex one with 10, 20, or 50 brokers
> and partition replicas spread across them: as the brokers are rolled one
> by one (in ZK mode) and start reporting the above error, the topics
> become unavailable one after the other, until all brokers are in this
> state and nothing works. This is because, from the perspective of the
> (still running) KRaft controller, the brokers are no longer available
> and the partition replicas are out of sync.
> Of course, as soon as you complete the rollback procedure by deleting
> the /controller znode, the brokers are able to elect a new controller
> among themselves and everything starts working again.
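> For completeness, the step that finally unblocks the brokers is just
> deleting that znode (assuming a local single-node ZooKeeper):
> {code:bash}
> # Delete the /controller znode so the ZK-mode brokers can elect a new
> # controller among themselves.
> ./bin/zookeeper-shell.sh localhost:2181 delete /controller
> {code}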
> My first question: isn't the cluster supposed to keep working and remain
> always available during the rollback, while the procedure is not yet
> completed? Or is cluster unavailability an accepted assumption during
> the rollback, until it's fully completed?
> This "unavailability" time window could be reduced by deleting the
> /controller znode before shutting down the KRaft controllers to allow the
> brokers electing a new controller among them, but in this case, could there
> be a race condition where KRaft controllers still running could steal
> leadership again?
> Or is there perhaps something missing in the documentation that leads to
> this problem?
--
This message was sent by Atlassian Jira
(v8.20.10#820010)