showuon commented on PR #21025:
URL: https://github.com/apache/kafka/pull/21025#issuecomment-3600448983
@jsancio @kevin-wu24, I found one drawback in the current solution: the
snapshot only contains `the latest voters at the time the snapshot was taken`.
That means we lose the voter change history from before the snapshot. This
could cause the following issue:
1. Controllers 1, 2, and 3 are the voters.
2. Controller 3 is removed, so the current voters are [1, 2].
3. Controller 3 cannot auto-join because the leader controller finds that 3
has previously been a voter.
4. The leader controller takes a snapshot of the metadata log.
5. The leader controller restarts and loads the snapshot plus the records
after the snapshot, so the voter history it rebuilds contains only [1, 2].
6. Controller 3 restarts and sends an addVoterRequest because auto-join is
enabled. This time it succeeds, because the leader controller no longer knows
that controller 3 has ever been a voter.
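
The six steps above can be modeled with a small, self-contained Java sketch.
This is not Kafka's implementation: `VoterHistorySketch` and all of its methods
are hypothetical names, and voters are identified by node id only (the real
key is a node id + directory UUID tuple). It only illustrates how a history
rebuilt from a snapshot forgets a removed voter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collection;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model, not Kafka's classes: the leader tracks the current
// voter set plus every node id it has ever seen as a voter.
final class VoterHistorySketch {
    private final Set<Integer> currentVoters = new HashSet<>();
    private final Set<Integer> everVoters = new HashSet<>();

    VoterHistorySketch(Collection<Integer> initialVoters) {
        currentVoters.addAll(initialVoters);
        everVoters.addAll(initialVoters);
    }

    // Removing a voter keeps it in the in-memory history.
    void removeVoter(int nodeId) {
        currentVoters.remove(nodeId);
    }

    // Auto-join is accepted only for nodes never seen as a voter.
    boolean tryAutoJoin(int nodeId) {
        if (everVoters.contains(nodeId)) {
            return false;
        }
        currentVoters.add(nodeId);
        everVoters.add(nodeId);
        return true;
    }

    // Assumption from the comment: a snapshot records only the voters
    // at the time it is taken, not the earlier changes.
    Set<Integer> takeSnapshot() {
        return new HashSet<>(currentVoters);
    }

    // After a restart, the history is rebuilt from the snapshot alone.
    static VoterHistorySketch restoreFrom(Set<Integer> snapshot) {
        return new VoterHistorySketch(new ArrayList<>(snapshot));
    }
}

final class VoterHistoryDemo {
    public static void main(String[] args) {
        VoterHistorySketch leader =
            new VoterHistorySketch(Arrays.asList(1, 2, 3));   // step 1
        leader.removeVoter(3);                                // step 2
        System.out.println(leader.tryAutoJoin(3));            // step 3: false
        Set<Integer> snapshot = leader.takeSnapshot();        // step 4
        VoterHistorySketch restarted =
            VoterHistorySketch.restoreFrom(snapshot);         // step 5
        System.out.println(restarted.tryAutoJoin(3));         // step 6: true
    }
}
```

In this model the second `tryAutoJoin(3)` succeeds only because `everVoters`
is reconstructed from the snapshot contents.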
My suggestions:
1. Keep the same semantics: `new controllers (node id + directory UUID
tuples) will automatically join the KRaft voter set once if they have not been
a voter before.`, but add a note in the docs telling users to shut down a
removed controller promptly, so that the node is not auto-joined again in some
edge cases.
2. Change the semantics to: `Only auto-join when a node starts up for the
first time`. A first startup is defined as: the LEO of the metadata log is 0.
Auto-join was originally designed for newly formatted controller nodes, so
this should not be a problem. The drawback of this approach is that auto-join
gets only one shot: if the first attempt fails and the node restarts, it will
not try again.
For example:
a. Controller A is formatted with auto-join enabled.
b. Controller A starts up and sends an addVoterRequest. (We can initialize
the updateVoter timer in the expired state and reset it after the response is
received.)
c. The leader controller responds with an error.
d. Controller A fetches the metadata log from the leader controller
(metadata log LEO > 0).
e. Before controller A retries the addVoter request, it crashes.
f. After restarting, controller A no longer sends the addVoter request
automatically, because its metadata log LEO > 0.
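
A minimal Java sketch of the check proposed in suggestion (2), replaying steps
a to f. `AutoJoinSketch` and its methods are hypothetical names used for
illustration only, and the sketch assumes the LEO survives a crash because the
metadata log is persisted on disk.

```java
// Hypothetical model of suggestion (2); names are illustrative, not Kafka's.
final class AutoJoinSketch {
    private long logEndOffset;            // LEO of the local metadata log
    private final boolean autoJoinEnabled;

    AutoJoinSketch(long logEndOffset, boolean autoJoinEnabled) {
        this.logEndOffset = logEndOffset;
        this.autoJoinEnabled = autoJoinEnabled;
    }

    // Only a node whose metadata log is empty is eligible to auto-join.
    boolean shouldAutoJoinOnStartup() {
        return autoJoinEnabled && logEndOffset == 0;
    }

    // Fetching from the leader appends records and advances the LEO.
    void fetchRecords(long count) {
        logEndOffset += count;
    }

    long logEndOffset() {
        return logEndOffset;
    }
}

final class AutoJoinDemo {
    public static void main(String[] args) {
        // a, b: freshly formatted node, empty log, so it sends addVoter.
        AutoJoinSketch controllerA = new AutoJoinSketch(0L, true);
        System.out.println(controllerA.shouldAutoJoinOnStartup()); // true

        // c, d: the leader answers with an error, but fetching continues.
        controllerA.fetchRecords(10L);

        // e, f: crash and restart; the persisted LEO is now greater than 0.
        AutoJoinSketch restarted =
            new AutoJoinSketch(controllerA.logEndOffset(), true);
        System.out.println(restarted.shouldAutoJoinOnStartup());   // false
    }
}
```

The sketch makes the trade-off visible: the eligibility check depends only on
local log state, so it cannot be fooled by a snapshot on the leader, but a
single fetch before the retry is enough to disable auto-join for good.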
I think suggestion (2) is better, since it avoids the re-join edge case
entirely. We just need to make it clear that auto-join can only happen when a
node has an empty metadata log. WDYT?