jsancio commented on PR #16862:
URL: https://github.com/apache/kafka/pull/16862#issuecomment-2287395733

   > @jsancio thank for the patch! I had a few questions after a brief review
   > 
   > If I understand this change, it looks like we are no longer tracking the 
next expected write offset and instead just appending to Raft to learn the last 
offset of the batch.
   
   I added a description to the PR and I think I address the motivation of this 
change there. Let me know if you have any other questions.
   
   > 
   > 1. Is it safe to schedule the write to Raft before the replay calls? I 
think it probably is. These two operations can tasks (writing to log and 
updating in-memory state) can probably be thought of as concurrent. As long as 
the in-memory state creates a snapshot with the correct offset, it should be 
fine. WDYT?
   
   It is not safe to schedule write before replay but this change doesn't do 
that. At a high-level the new controller algorithm for write events is:
   1. generate the cluster metadata records
   2. stage them in the RaftClient so that the offsets are allocated 
(RaftClient#prepareAppend)
   3. replay the cluster metadata records against the active controller state 
machine
   4. tell the RaftClient client that is can write/append the staged batches to 
the log (RaftClient#schedulePreparedAppend)
   
   > 2. Do we need to track nextWriteOffset any more if we aren't using it? We 
the nextWriteOffset to set the broker epoch when registering brokers. I think 
this just needs to be monotonic relative to previous broker epochs. Maybe we 
can remove nextWriteOffset from OffsetControlManager and just track the "last 
written offset" instead?
   
   Isn't that the same thing? Isn't `"last written offset" = "next write 
offset" - 1`?
   
   > 3. In the previous code we tracked the next expected offset and passed it 
down to Raft each time we did an append. AIUI, this sort of acted like an 
optimistic concurrency control. E.g., "only do this operation if the next 
offset is X". We still have the controller epoch which serves a similar 
function, but I'm curious if there are side effects to removing the external 
offset tracking.
   
   Yes. The issue with that solution is that is doesn't allow for the KRaft 
leader to append control records to the log. Appending control records to the 
log would make it so that the controller known offset doesn't match the kraft 
leader's LEO. The KRaft leader needs to append control records to implement 
dynamically adding and removing controller/voters.
   
   Take a look at the description of the PR. I have updated it since you last 
reviewed this PR. I tried to motivate the problem and explain the solution.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to