jsancio opened a new pull request, #16626:
URL: https://github.com/apache/kafka/pull/16626

   This change implements the AddVoter RPC. The high-level algorithm is as 
follows:
   
   1. Check that the leader has fenced the previous leader(s) by checking that 
the high-watermark (HWM) is known, otherwise return the REQUEST_TIMED_OUT error.
   2. Check that the cluster supports kraft.version 1, otherwise return the 
UNSUPPORTED_VERSION error.
   3. Check that there are no uncommitted voter changes, otherwise return the 
REQUEST_TIMED_OUT error.
   4. Check that the new voter's id is not part of the existing voter set, 
otherwise return the DUPLICATE_VOTER error.
   5. Send an API_VERSIONS RPC to the first (default) listener to discover the 
supported kraft.version of the new voter.
   6. Check that the new voter supports the current kraft.version, otherwise 
return the INVALID_REQUEST error.
   7. Check that the new voter is caught up to the log end offset of the 
leader, otherwise return the REQUEST_TIMED_OUT error.
   8. Append the updated VotersRecord to the log.
   The KRaft internal listener will read this uncommitted record from the log 
and add the new voter to the set of voters.
   9. Wait for the VotersRecord to commit using the majority of the new set of 
voters. Return a REQUEST_TIMED_OUT error if it doesn't commit in time.
   10. Send the AddVoter successful response to the client.
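   
   The synchronous checks in steps 1 through 4 can be sketched as below. This 
is a minimal illustration, not the code in this change: `LeaderView`, 
`precheck_add_voter` and their fields are hypothetical names, and "supports 
kraft.version 1" is modeled as a simple integer comparison, which is an 
assumption.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class LeaderView:
    """Hypothetical snapshot of the leader state that steps 1-4 consult."""
    high_watermark: Optional[int]        # None until the leader knows the HWM
    kraft_version: int                   # finalized kraft.version of the cluster
    has_uncommitted_voter_change: bool   # a voter change is still uncommitted
    voter_ids: Set[int] = field(default_factory=set)


def precheck_add_voter(leader: LeaderView, new_voter_id: int) -> Optional[str]:
    """Return the error name for the first failed check, or None if all pass."""
    # Step 1: an unknown HWM means previous leaders may not be fenced yet.
    if leader.high_watermark is None:
        return "REQUEST_TIMED_OUT"
    # Step 2: the cluster must support kraft.version 1 (assumed to mean >= 1).
    if leader.kraft_version < 1:
        return "UNSUPPORTED_VERSION"
    # Step 3: only one voter change may be in flight at a time.
    if leader.has_uncommitted_voter_change:
        return "REQUEST_TIMED_OUT"
    # Step 4: the new voter's id must not already be in the voter set.
    if new_voter_id in leader.voter_ids:
        return "DUPLICATE_VOTER"
    return None
```

   The order of the checks matters in this sketch: a leader that has not yet 
established the HWM reports REQUEST_TIMED_OUT even when the request would also 
fail a later check.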
   
   The leader implements the above algorithm by tracking 3 events: the 
ADD_VOTER request is received, the API_VERSIONS response is received and 
finally the HWM is updated.
   
   The state of the ADD_VOTER operation is tracked by LeaderState using the 
AddVoterHandlerState. The algorithm is implemented by the AddVoterHandler type.
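   
   A toy model of the three events, covering steps 5 through 10, might look 
like the following. It is a sketch under stated assumptions and does not mirror 
AddVoterHandler or AddVoterHandlerState: the class, its methods and its fields 
are hypothetical, timeouts are not modeled, and a record at offset `o` is 
assumed to be committed once the HWM is greater than `o`.

```python
from typing import Dict, Optional, Set


class AddVoterSketch:
    """Hypothetical event-driven model of one in-flight AddVoter operation."""

    def __init__(self, kraft_version: int, log_end_offset: int, voters: Set[int]):
        self.kraft_version = kraft_version
        self.log_end_offset = log_end_offset
        self.voters = set(voters)
        # Last known log end offset of each replica, as tracked by the leader.
        self.follower_end_offsets: Dict[int, int] = {}
        self.pending_voter: Optional[int] = None
        self.record_offset: Optional[int] = None

    def on_add_voter_request(self, voter_id: int) -> None:
        # Event 1: steps 1-4 are assumed to have passed already.
        # Step 5 would send API_VERSIONS to the new voter here.
        self.pending_voter = voter_id

    def on_api_versions_response(self, min_version: int, max_version: int) -> Optional[str]:
        # Event 2: returns an error name, or None while the operation continues.
        voter_id = self.pending_voter
        if voter_id is None:
            return None
        # Step 6: the new voter must support the current kraft.version.
        if not (min_version <= self.kraft_version <= max_version):
            self.pending_voter = None
            return "INVALID_REQUEST"
        # Step 7: the new voter must be caught up to the leader's log end offset.
        if self.follower_end_offsets.get(voter_id, -1) < self.log_end_offset:
            self.pending_voter = None
            return "REQUEST_TIMED_OUT"
        # Step 8: append the VotersRecord; the uncommitted record already
        # takes effect on the voter set.
        self.record_offset = self.log_end_offset
        self.log_end_offset += 1
        self.voters.add(voter_id)
        return None

    def on_high_watermark_update(self, high_watermark: int) -> Optional[str]:
        # Event 3: steps 9-10, succeed once the VotersRecord is committed.
        if self.record_offset is None or high_watermark <= self.record_offset:
            return None
        self.pending_voter = None
        self.record_offset = None
        return "NONE"
```

   In this model each event either completes the operation with a result or 
leaves it pending for the next event, which is why a single piece of state is 
enough to track the operation between events.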
   
   This change also fixes a small issue introduced by the bootstrap checkpoint 
(0-0.checkpoint). The internal partition listener 
(KRaftControlRecordStateMachine) and the external partition listener 
(KafkaRaftClient.ListenerContext) were using "nextOffset = 0" as the initial 
state of the reading cursor. This caused the bootstrap checkpoint to be 
reloaded repeatedly until the leader wrote a record to the log. Changing the 
initial offset (nextOffset) to -1 allows the listeners to distinguish between 
the initial state (nextOffset == -1) and the state where the bootstrap 
checkpoint was loaded but the log segment is empty (nextOffset == 0).
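   
   The effect of the sentinel can be shown with a simplified model of a 
listener cursor. This is an illustration only, not the code of 
KRaftControlRecordStateMachine or KafkaRaftClient.ListenerContext: 
`ListenerCursor` and `loads_with_empty_log` are hypothetical names, and the 
reload condition is reduced to "the cursor is still at its initial value".

```python
class ListenerCursor:
    """Hypothetical reading cursor that loads the bootstrap checkpoint once."""

    def __init__(self, initial_offset: int):
        self.initial_offset = initial_offset
        self.next_offset = initial_offset
        self.checkpoint_loads = 0

    def poll(self, log_end_offset: int) -> None:
        # A cursor still at its initial value has not loaded anything yet.
        if self.next_offset == self.initial_offset:
            self.checkpoint_loads += 1
            # The bootstrap checkpoint (0-0.checkpoint) ends at offset 0.
            self.next_offset = 0
        # Advance past any records that exist in the log.
        if log_end_offset > self.next_offset:
            self.next_offset = log_end_offset


def loads_with_empty_log(initial_offset: int, polls: int) -> int:
    """Count checkpoint loads when polling a log that has no records."""
    cursor = ListenerCursor(initial_offset)
    for _ in range(polls):
        cursor.poll(log_end_offset=0)
    return cursor.checkpoint_loads
```

   With an initial value of 0, three polls against an empty log load the 
checkpoint three times in this model, because "nothing loaded yet" and 
"checkpoint loaded, log empty" look identical. With an initial value of -1 the 
checkpoint is loaded once.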
   
   ### Committer Checklist (excluded from commit message)
   - [ ] Verify design and implementation 
   - [ ] Verify test coverage and CI build status
   - [ ] Verify documentation (including upgrade notes)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
