Hi all,

Any feedback on the update for the auto-join feature? If not, we'll go ahead and work on this in v4.2.0.

Thank you,
Luke

On Thu, Nov 13, 2025 at 3:56 PM Luke Chen <[email protected]> wrote:
> Hi all,
>
> We are currently integrating the dynamic voter change feature into our
> downstream project and found an issue with auto-join (KAFKA-19850
> <https://issues.apache.org/jira/browse/KAFKA-19850>).
>
> The main problem is that when auto.join is enabled, a removed voter is
> auto-joined again immediately. We know about this limitation, so we ask
> users to "shut down the node" before removing the voter. However, this
> causes some problems:
>
> 1. Broken quorum: Shutting down the to-be-removed controller first might
> break the quorum in the worst case. For example, take a quorum of 3
> controller nodes (C1, C2, C3) where C1 is the leader, C3 has already
> caught up with C1, and C2 is still catching up with the leader. When
> users want to remove C3, they follow the guide and shut down C3 first.
> At that point the quorum is broken and the Kafka cluster is basically
> unavailable.
>
> 2. Not a cloud-native operation: In a cloud environment (k8s), it's not
> possible to shut down a node, wait for something to complete, and then
> start it up again.
>
> 3. User confusion: If users don't check the docs first and remove the
> voter directly with auto.join enabled, the removed node rejoins
> immediately, which confuses users.
>
>
> We are currently working on a fix for v4.2.0. Here are our thoughts:
> 1. Avoid auto-joining a removed node into the voters until the node is
> restarted. During this period, the node can be added manually.
> 2. Add a timer (e.g., 5 minutes) after a node is removed. The node is
> auto-joined after the timeout or a node restart. The timeout can be
> made configurable in a future release.
>
> Personally, I think solution (1) makes more sense. Solution (2) might
> also cause an unexpected auto-join if the timer is too short.
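
To make the difference between the two options above concrete, here is a minimal sketch of the decision each one implies. It is only an illustration, not the actual KRaft code: `RemovedVoterState`, `may_auto_join_option1`, `may_auto_join_option2`, and the 5-minute default are hypothetical names and values chosen for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RemovedVoterState:
    """Hypothetical per-node bookkeeping; not an actual Kafka class."""
    # Time (ms) at which the node was removed from the voter set, if ever.
    removed_at_ms: Optional[int] = None
    # Whether the node has restarted since it was removed.
    restarted_since_removal: bool = False


def may_auto_join_option1(state: RemovedVoterState) -> bool:
    """Option 1: a removed node is not auto-joined until it restarts."""
    if state.removed_at_ms is None:
        return True  # never removed, so auto-join is not suppressed
    return state.restarted_since_removal


def may_auto_join_option2(state: RemovedVoterState, now_ms: int,
                          timeout_ms: int = 5 * 60 * 1000) -> bool:
    """Option 2: auto-join again after a timeout or a restart."""
    if state.removed_at_ms is None:
        return True  # never removed, so auto-join is not suppressed
    if state.restarted_since_removal:
        return True
    return now_ms - state.removed_at_ms >= timeout_ms
```

With option 2, a node removed at time 0 would be eligible again at 300000 ms even without a restart, which is the "unexpected auto-join" risk if the timer is shorter than the operator's maintenance window.
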
>
> I also think we should change the semantics of auto-join to "a node is
> auto-joined only at node startup". This way, we don't have to ask users
> to shut down the node before removing a voter. I also think this change
> can be included in v4.2.0 because we haven't released the auto-join
> feature yet.
>
> Do you have any thoughts?
>
>
> Thank you,
> Luke
>
> On Fri, Mar 29, 2024 at 8:58 AM José Armando García Sancio
> <[email protected]> wrote:
>
>> Jun, thanks a lot for your help. I feel that the KIP is much better
>> after your detailed input.
>>
>> If there is no more feedback, I'll start a voting thread tomorrow
>> morning. I'll monitor KIP-1022's discussion thread and update this KIP
>> with anything that affects the KIP's specification.
>>
>> Thanks,
>> --
>> -José
>>
>
