kevin-wu24 commented on PR #20859: URL: https://github.com/apache/kafka/pull/20859#issuecomment-3587856896
> What I'm afraid is that we shipped in v4.2 with "the aggressive auto-join semantic even if one node is just removed from the voters set". Then in v4.3, the semantic suddenly changes to "auto-join only when startup". It will confuse users and even surprise users. I think we should decide which semantic we want to deliver first, instead of deliver it then change it.

Yeah, that makes sense to me. One thing that confuses me about the proposed semantic from the JIRA: why make it idempotent per (`ReplicaKey` + process incarnation) tuple? I would argue that having the node try to auto-join on restart is just as confusing a UX. In the original use case, if the node just gets restarted, we end up in the same place: the controller rejoins the voter set when we don't "want" it to (i.e. after an explicit removal).

Another approach is to make auto-join idempotent per `ReplicaKey` and handle this server-side. I think this approach best captures the intent of the feature and maintains a good UX. Basically, when I start up a new observer controller with auto-join, it will join the quorum automatically at most once for the lifetime of its `ReplicaKey`. If the controller's disk fails and it restarts, it will have a new `ReplicaKey` and will auto-join again.

This does not require a KIP, and would be simpler to implement than either of the discussed approaches.
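To make the per-`ReplicaKey` idea concrete, here is a rough sketch of the server-side bookkeeping I have in mind. Everything in it is hypothetical: the `ReplicaKey` record is a simplified stand-in for the real class, and `AutoJoinTracker` and `tryAutoJoin` are names I made up for illustration, not anything in the PR or in Kafka today. It also keeps state in memory only; a real version would presumably need that state to survive leader changes.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical, simplified stand-in for Kafka's ReplicaKey:
// a node ID plus the ID of its metadata log directory.
record ReplicaKey(int id, String directoryId) {}

// Hypothetical leader-side bookkeeping (assumption, not actual Kafka code).
final class AutoJoinTracker {
    // Keys that have already used their one automatic join.
    private final Set<ReplicaKey> alreadyAutoJoined = new HashSet<>();

    // Returns true only the first time a given ReplicaKey asks to auto-join.
    // An explicit voter removal does not clear the entry, so a removed node
    // that restarts with the same key is not automatically added back.
    synchronized boolean tryAutoJoin(ReplicaKey key) {
        return alreadyAutoJoined.add(key);
    }
}

final class AutoJoinDemo {
    public static void main(String[] args) {
        AutoJoinTracker tracker = new AutoJoinTracker();
        ReplicaKey original = new ReplicaKey(3, "dir-A");

        // New observer controller starts up: auto-join is allowed.
        System.out.println(tracker.tryAutoJoin(original));   // true

        // Same node restarts after an explicit removal: same key, rejected.
        System.out.println(tracker.tryAutoJoin(original));   // false

        // Disk fails and is replaced: new directory ID, so a new key.
        ReplicaKey afterDiskLoss = new ReplicaKey(3, "dir-B");
        System.out.println(tracker.tryAutoJoin(afterDiskLoss)); // true
    }
}
```

The point of the sketch is that the decision depends only on the `ReplicaKey`, not on the process incarnation, so a plain restart never triggers a second automatic join.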
