kevin-wu24 commented on PR #20859:
URL: https://github.com/apache/kafka/pull/20859#issuecomment-3587856896

   > What I'm afraid is that we shipped in v4.2 with "the aggressive auto-join 
semantic even if one node is just removed from the voters set". Then in v4.3, 
the semantic suddenly changes to "auto-join only when startup". It will confuse 
users and even surprise users. I think we should decide which semantic we want 
to deliver first, instead of deliver it then change it.
   
   Yeah, that makes sense to me. 
   
   One thing that confuses me with the proposed semantic from the JIRA is: 
   
   Why make the proposed semantic idempotent per (`ReplicaKey` + process 
incarnation) tuple? I would argue that having the node try to auto-join on 
restart is just as confusing a UX. In the original use case, if the node simply 
gets restarted, we end up in the same place: the controller rejoins the 
voter set when we don't "want" it to (i.e. after an explicit removal).
   
   Another approach is to make auto-join idempotent per `ReplicaKey` and 
handle this server-side. I think this approach best captures the intent of the 
feature and maintains a good UX. Basically, when I start up a new observer 
controller with auto-join, it will join the quorum automatically at most once 
for the lifetime of its `ReplicaKey`. If the controller's disk fails and it 
restarts, it will have a new `ReplicaKey` and will auto-join again. This does 
not require a KIP and would be simpler to implement than either of the 
discussed approaches.
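
   To make the "at most once per `ReplicaKey`" idea concrete, here is a rough 
sketch of the bookkeeping I have in mind. This is not code from the PR: 
`ReplicaKeySketch` and `AutoJoinTracker` are hypothetical names, the key is 
modeled as a node id plus a directory id string, and the in-memory set stands 
in for whatever durable state the server side would actually use.

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

// Hypothetical stand-in for a replica key: node id plus directory id.
// Assumption: a disk failure yields a new directory id, hence a new key.
final class ReplicaKeySketch {
    final int nodeId;
    final String directoryId;

    ReplicaKeySketch(int nodeId, String directoryId) {
        this.nodeId = nodeId;
        this.directoryId = directoryId;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof ReplicaKeySketch)) return false;
        ReplicaKeySketch other = (ReplicaKeySketch) o;
        return nodeId == other.nodeId && directoryId.equals(other.directoryId);
    }

    @Override
    public int hashCode() {
        return Objects.hash(nodeId, directoryId);
    }
}

// Hypothetical server-side tracker: remembers every key that has already
// been auto-joined so that a later restart or explicit removal does not
// trigger a second automatic join for the same key.
final class AutoJoinTracker {
    // Assumption: in-memory only; a real version would need durable state.
    private final Set<ReplicaKeySketch> alreadyAutoJoined = new HashSet<>();

    // Returns true only the first time a given key asks to auto-join.
    boolean shouldAutoJoin(ReplicaKeySketch key) {
        return alreadyAutoJoined.add(key);
    }
}
```

   With this shape, a restarted controller that keeps the same directory id is 
not auto-joined again, while a controller that comes back with a new directory 
id is treated as a new key and auto-joins once.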


