jsancio commented on code in PR #17500: URL: https://github.com/apache/kafka/pull/17500#discussion_r1799965560
########## raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java: ########## @@ -258,7 +260,8 @@ public KafkaRaftClient( localSupportedKRaftVersion, logContext, new Random(), - quorumConfig + quorumConfig, + new HashSet<>() Review Comment: KRaft should not track the starting offsets in this way. There are many issues with this approach. The more important ones are: 1. This doesn't track batch start offsets that were added to the log prior to starting the KRaft replica. It also doesn't track batch start offsets that were added by the leader. 2. This doesn't handle log truncation. 3. It is very memory inefficient. It is possible for KRaft to have a lot of small batches and each one of them would have an entry in this set. The log layer already track these offsets. We shouldn't duplicate this information in the KRaft layer. Take a look at the `KafkaMetadataLog#read` method to see how the log layer computes the batch start offset. Specifically, you can take a look at `LogSegment#translateOffset`. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org