chia7712 commented on code in PR #18012:
URL: https://github.com/apache/kafka/pull/18012#discussion_r1908110068
##########
storage/src/main/java/org/apache/kafka/storage/internals/log/LogSegment.java:
##########
@@ -257,13 +257,21 @@ public void append(long largestOffset,
if (largestTimestampMs > maxTimestampSoFar()) {
maxTimestampAndOffsetSoFar = new TimestampOffset(largestTimestampMs, shallowOffsetOfMaxTimestamp);
}
- // append an entry to the index (if needed)
+ // append an entry to the timestamp index at MemoryRecords level (if needed)
if (bytesSinceLastIndexEntry > indexIntervalBytes) {
- offsetIndex().append(largestOffset, physicalPosition);
timeIndex().maybeAppend(maxTimestampSoFar(), shallowOffsetOfMaxTimestampSoFar());
- bytesSinceLastIndexEntry = 0;
}
- bytesSinceLastIndexEntry += records.sizeInBytes();
+
+ // append an entry to the offset index at batches level (if needed)
+ for (RecordBatch batch : records.batches()) {
+ if (bytesSinceLastIndexEntry > indexIntervalBytes &&
+ batch.lastOffset() >= offsetIndex().lastOffset()) {
+ offsetIndex().append(batch.lastOffset(), physicalPosition);
Review Comment:
Maybe we can refactor this hot method a bit to reduce the number of buffer
accesses and checks.
```java
for (RecordBatch batch : records.batches()) {
    var batchMaxTimestamp = batch.maxTimestamp();
    var batchLastOffset = batch.lastOffset();
    var updateTimeIndex = false;
    if (batchMaxTimestamp > maxTimestampSoFar()) {
        maxTimestampAndOffsetSoFar = new TimestampOffset(batchMaxTimestamp, batchLastOffset);
        updateTimeIndex = true;
    }
    if (bytesSinceLastIndexEntry > indexIntervalBytes) {
        offsetIndex().append(batchLastOffset, physicalPosition);
        // max timestamp may not be monotonic, so we need to check it
        // to avoid the time index append error
        if (updateTimeIndex)
            timeIndex().maybeAppend(maxTimestampSoFar(), shallowOffsetOfMaxTimestampSoFar());
        bytesSinceLastIndexEntry = 0;
    }
    var sizeInBytes = batch.sizeInBytes();
    physicalPosition += sizeInBytes;
    bytesSinceLastIndexEntry += sizeInBytes;
}
```
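To make the behavior of the loop above easier to check, here is a minimal standalone sketch of the same control flow. Everything in it is an assumption for illustration: `PerBatchIndexSketch`, `Batch`, `Result`, and the list-backed indexes are hypothetical stand-ins, not Kafka classes, and the real `offsetIndex()` / `timeIndex()` do their own validation that is not modeled here.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the per-batch loop; none of these types are Kafka classes.
final class PerBatchIndexSketch {

    // Only the three batch properties the loop reads.
    static final class Batch {
        final long lastOffset;
        final long maxTimestamp;
        final int sizeInBytes;

        Batch(long lastOffset, long maxTimestamp, int sizeInBytes) {
            this.lastOffset = lastOffset;
            this.maxTimestamp = maxTimestamp;
            this.sizeInBytes = sizeInBytes;
        }
    }

    // List-backed stand-ins for the two indexes.
    static final class Result {
        final List<long[]> offsetIndex = new ArrayList<>(); // {offset, position}
        final List<long[]> timeIndex = new ArrayList<>();   // {timestamp, offset}
    }

    static Result append(List<Batch> batches, int indexIntervalBytes) {
        Result result = new Result();
        long maxTimestampSoFar = -1L;
        long offsetOfMaxTimestampSoFar = -1L;
        int physicalPosition = 0;
        int bytesSinceLastIndexEntry = 0;

        for (Batch batch : batches) {
            // Read each batch property once, then reuse the local copies.
            long batchMaxTimestamp = batch.maxTimestamp;
            long batchLastOffset = batch.lastOffset;
            boolean updateTimeIndex = false;
            if (batchMaxTimestamp > maxTimestampSoFar) {
                maxTimestampSoFar = batchMaxTimestamp;
                offsetOfMaxTimestampSoFar = batchLastOffset;
                updateTimeIndex = true;
            }
            if (bytesSinceLastIndexEntry > indexIntervalBytes) {
                result.offsetIndex.add(new long[] {batchLastOffset, physicalPosition});
                // Only add a time entry when this batch raised the max timestamp.
                if (updateTimeIndex) {
                    result.timeIndex.add(new long[] {maxTimestampSoFar, offsetOfMaxTimestampSoFar});
                }
                bytesSinceLastIndexEntry = 0;
            }
            int sizeInBytes = batch.sizeInBytes;
            physicalPosition += sizeInBytes;
            bytesSinceLastIndexEntry += sizeInBytes;
        }
        return result;
    }

    public static void main(String[] args) {
        // Four 60-byte batches with a non-monotonic max timestamp, interval of 50 bytes.
        Result r = append(List.of(
                new Batch(9, 100, 60),
                new Batch(19, 90, 60),
                new Batch(29, 200, 60),
                new Batch(39, 150, 60)), 50);
        for (long[] e : r.offsetIndex) {
            System.out.println("offset index: offset=" + e[0] + " position=" + e[1]);
        }
        for (long[] e : r.timeIndex) {
            System.out.println("time index: timestamp=" + e[0] + " offset=" + e[1]);
        }
    }
}
```

With this input the sketch records offset entries (19, 60), (29, 120), and (39, 180), and a single time entry (200, 29). One thing it makes visible: a time index entry is written only when the batch that crosses the interval also raises the max timestamp, so a new max set by an earlier batch (timestamp 100 here) is never indexed unless a later batch exceeds it.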