Hi all,

Any feedback on the proposed update to the auto-join feature?
If not, we'll go ahead and work on this for v4.2.0.

Thank you,
Luke

On Thu, Nov 13, 2025 at 3:56 PM Luke Chen <[email protected]> wrote:

> Hi all,
>
> We are currently integrating the dynamic voter change feature into our
> downstream project and found an issue with auto-join (KAFKA-19850
> <https://issues.apache.org/jira/browse/KAFKA-19850>).
>
> The main problem is that when auto.join is enabled, a removed voter is
> auto-joined again immediately. We know about this limitation, so we ask
> users to shut down the node before removing the voter. However, this
> causes some problems:
>
> 1. Broken quorum: Shutting down the to-be-removed controller first might
> break the quorum in the worst case. For example, take a 3-controller
> quorum (C1, C2, C3) where C1 is the leader, C3 has caught up with C1, and
> C2 is still catching up. To remove C3, users follow the guide and shut
> down C3 first. At that point the quorum is broken, because C3 is down and
> C2 has not caught up, and the Kafka cluster is basically unavailable.
>
> 2. Not a cloud-native operation: In a cloud environment (k8s), it's not
> possible to shut down a node, wait for something to complete, and then
> start it up again.
>
> 3. User confusion: If users don't check the docs first and remove a voter
> directly with auto.join enabled, the removed node rejoins immediately,
> which confuses users.
>
>
> We are currently working on a fix for v4.2.0. Here are our thoughts:
> 1. Don't auto-join a removed node into the voters until the node is
> restarted. In the meantime, the node can still be added manually.
> 2. Start a timer (e.g., 5 minutes) when a node is removed. The node is
> auto-joined after the timeout or after a node restart. The timeout could
> be made configurable in a future release.
>
> Personally, solution (1) makes more sense to me. Solution (2) might also
> cause an unexpected auto-join if the timer is too short.
>
> I also think we should change the semantics of auto-join to "a node is
> auto-joined only at node startup". This way, we don't have to ask users
> to shut down the node before removing a voter. I also think this change
> can be included in v4.2.0 because we haven't released the auto-join
> feature yet.
>
> Do you have any thoughts?
>
>
> Thank you,
> Luke
>
> On Fri, Mar 29, 2024 at 8:58 AM José Armando García Sancio
> <[email protected]> wrote:
>
>> Jun, thanks a lot for your help. I feel that the KIP is much better
>> after your detailed input.
>>
>> If there is no more feedback, I'll start a voting thread tomorrow
>> morning. I'll monitor KIP-1022's discussion thread and update this KIP
>> with anything that affects the KIP's specification.
>>
>> Thanks,
>> --
>> -José
>>
>
