[
https://issues.apache.org/jira/browse/KAFKA-19850?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Luke Chen updated KAFKA-19850:
------------------------------
Description:
In v4.2.0, a controller can automatically join the quorum when
`controller.quorum.auto.join.enable=true` is set
([KIP-853|https://cwiki.apache.org/confluence/display/KAFKA/KIP-853%3A+KRaft+Controller+Membership+Changes#KIP853:KRaftControllerMembershipChanges-Controllerautojoining], KAFKA-19078).
This is a good improvement for controller addition, but it has a UX issue:
when a controller is removed via removeVoterRequest, it is added back
immediately because of `controller.quorum.auto.join.enable=true`. The KIP
also states that the controller must be stopped before it is removed:
{noformat}
controller.quorum.auto.join.enable:
Controls whether a KRaft controller should automatically join the cluster
metadata partition for its cluster id. If the configuration is set to
true the controller must be stopped before removing the controller with
kafka-metadata-quorum remove-controller.{noformat}
This "shut down the to-be-removed controller first" step might break the
quorum in the worst case. For example, consider a quorum of 3 controller nodes
(C1, C2, C3) where C1 is the leader, C3 has already caught up with C1, and C2
is still catching up with the leader. When users want to remove C3, they follow
the guide and shut down C3 first. At this point only C1 and the lagging C2
remain, so the quorum is broken and the Kafka cluster is basically unavailable.
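
The arithmetic behind this example can be sketched as follows. This is a minimal illustration, assuming a simplified model in which a voter counts toward the majority only if it is running and caught up with the leader; the real KRaft commit rules are more involved. The class `QuorumCheck`, the record `Voter`, and their methods are hypothetical names, not Kafka code.

```java
import java.util.Map;

public class QuorumCheck {
    // Hypothetical simplified model: a voter helps commit new records only if
    // it is running and has caught up with the leader's log.
    public record Voter(boolean running, boolean caughtUp) {}

    // Smallest number of voters that forms a majority.
    public static int majority(int voterCount) {
        return voterCount / 2 + 1;
    }

    // True if enough running, caught-up voters exist to form a majority.
    public static boolean canCommit(Map<String, Voter> voters) {
        long usable = voters.values().stream()
            .filter(v -> v.running() && v.caughtUp())
            .count();
        return usable >= majority(voters.size());
    }

    public static void main(String[] args) {
        Map<String, Voter> before = Map.of(
            "C1", new Voter(true, true),    // leader
            "C2", new Voter(true, false),   // still catching up
            "C3", new Voter(true, true));   // caught up
        Map<String, Voter> afterStoppingC3 = Map.of(
            "C1", new Voter(true, true),
            "C2", new Voter(true, false),
            "C3", new Voter(false, true));  // stopped, but still in the voters set
        System.out.println(canCommit(before));          // true
        System.out.println(canCommit(afterStoppingC3)); // false
    }
}
```

Under this model, stopping C3 before removing it leaves a voters set of size 3 with only one usable voter, which is below the majority of 2.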
Furthermore, this behavior is not user friendly. It will confuse users and make
them think something is wrong with the controller removal. Besides, in a
Kubernetes environment managed by an operator, shutting down a node, performing
some operation, and then starting it up again is not the cloud-native way.
So, I propose we improve it with "the removed controller will not be
auto-joined until this controller is restarted". That is:
1. Once the controller is removed from the voters set, it won't be auto-joined
even if `controller.quorum.auto.join.enable=true`.
2. The controller can still be manually joined to the voters set in this state.
3. The controller node will auto-join the voters set after the node is restarted.
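
The three rules above can be sketched as a small piece of state kept by the controller. This is only an illustration of the proposed semantics, assuming the "removed" flag lives in memory so that a restart clears it. The class `AutoJoinPolicy` and all of its methods are hypothetical and do not reflect the actual KRaft client code.

```java
public class AutoJoinPolicy {
    // Mirrors controller.quorum.auto.join.enable.
    private final boolean autoJoinEnabled;
    // Whether this controller is currently in the voters set.
    private boolean voter;
    // Set when this controller is removed from the voters set. The flag is
    // in-memory only, so a restart (a new instance in this sketch) clears it.
    private boolean removedSinceStartup = false;

    public AutoJoinPolicy(boolean autoJoinEnabled, boolean voterAtStartup) {
        this.autoJoinEnabled = autoJoinEnabled;
        this.voter = voterAtStartup;
    }

    // Rule 1: no auto join after a removal, even if the config is true.
    public boolean shouldAutoJoin() {
        return autoJoinEnabled && !voter && !removedSinceStartup;
    }

    // Called when a removal of this controller is observed.
    public void onRemovedFromVoters() {
        voter = false;
        removedSinceStartup = true;
    }

    // Rule 2: a manual add still works in the "removed" state.
    public void onAddedToVoters() {
        voter = true;
    }

    public boolean isVoter() {
        return voter;
    }

    public static void main(String[] args) {
        AutoJoinPolicy running = new AutoJoinPolicy(true, true);
        running.onRemovedFromVoters();
        System.out.println(running.shouldAutoJoin());   // false

        // Rule 3: after a restart the node starts fresh as a non-voter.
        AutoJoinPolicy restarted = new AutoJoinPolicy(true, false);
        System.out.println(restarted.shouldAutoJoin()); // true
    }
}
```

Keeping the flag out of persistent storage is a design choice made for this sketch: it gives "auto join only at startup" without any new on-disk state.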
So in short, the semantics of auto join are updated to "a node is auto-joined
only at node startup". I think this makes more sense to users.
Thoughts?
was:
In v4.2.0, we are able to auto join a controller with the configuration
`controller.quorum.auto.join.enable=true` set
([KIP-853|https://cwiki.apache.org/confluence/display/KAFKA/KIP-853%3A+KRaft+Controller+Membership+Changes#KIP853:KRaftControllerMembershipChanges-Controllerautojoining](KAFKA-19078)).
This is a good improvement for controller addition, but it has a UX issue,
which is that when a controller is removed via removeVoterRequest, it will be
added immediately due to `controller.quorum.auto.join.enable=true`. In the KIP,
we also mention you have to stop the controller before removing the controller:
{noformat}
controller.quorum.auto.join.enable:
Controls whether a KRaft controller should automatically join the cluster
metadata partition for its cluster id. If the configuration is set to
true the controller must be stopped before removing the controller with
kafka-metadata-quorum remove-controller.{noformat}
This "shutdown the to-be-removed controller first" operation might break the
quorum in the worst case. For example, 3 controller nodes quorum (C1, C2, C3),
C1 is the leader, C3 is already caught up with C1, C2 is still catching up with
the leader. When users want to remove C3, following the guide, users shutdown
the C3 first. But at this point of time, the quorum is broken and the kafka
cluster is basically unavailable.
Furthermore, this is not a user friendly behavior. And it will cause many
confusion to users and thought there is something wrong in the controller
removal. Besides, In the kubernetes environment which is controlled by the
operator, it is not the cloud native way to shutdown a node, do some operation,
then start it up.
So, I propose we can improve it by "the removed controller will not be auto
joined before this controller restarted". That is:
1. Once the controller is removed from voters set, it won't be auto joined even
if `controller.quorum.auto.join.enable=true`
2. The controller can be manually join the voters in this state
3. The controller node will be auto join the voters set after node restarted.
So basically, the semantics is not changed, it just add some unexpected
remove/add loop. Thoughts?
> KRaft voter auto join will add a removed voter immediately
> ----------------------------------------------------------
>
> Key: KAFKA-19850
> URL: https://issues.apache.org/jira/browse/KAFKA-19850
> Project: Kafka
> Issue Type: Improvement
> Affects Versions: 4.2.0
> Reporter: Luke Chen
> Priority: Major