[
https://issues.apache.org/jira/browse/KAFKA-806?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17902264#comment-17902264
]
PoAn Yang commented on KAFKA-806:
---------------------------------
Hi [~chia7712], thanks for the information. I tried adding the following test
to LogSegmentTest. If a single MemoryRecords contains many batches,
`LogSegment#read` may take up to 100 ms. I think we can improve this part. May I
take the issue? Thanks.
{code:java}
@Test
public void testIndex() throws IOException {
    int recordsInBatch = 100;
    int batchInMemoryRecords = 100000;
    LogSegment segment = createSegment(0, 1, Time.SYSTEM);
    // Pack many batches into one buffer so that a single append covers all of them.
    ByteBuffer buffer = ByteBuffer.allocate(recordsInBatch * batchInMemoryRecords * 100);
    for (int j = 0; j < batchInMemoryRecords; j++) {
        MemoryRecordsBuilder builder = MemoryRecords.builder(buffer,
            Compression.NONE, TimestampType.CREATE_TIME, j * recordsInBatch);
        for (int k = 0; k < recordsInBatch; k++) {
            builder.append(-1L, "key1".getBytes(), "value1".getBytes());
        }
        builder.close();
    }
    buffer.flip();
    MemoryRecords record = MemoryRecords.readableRecords(buffer);
    segment.append(0L, RecordBatch.NO_TIMESTAMP, -1L, record);
    // Time a read of the last offset, which lies far from the start of the append.
    long startMs = System.currentTimeMillis();
    segment.read(9999999, 1);
    System.out.println("read cost: " + (System.currentTimeMillis() - startMs) + "ms");
}
{code}
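To illustrate why the read above can be slow, here is a rough, self-contained toy model. It is not Kafka code: the class `SparseIndexSketch`, its methods, and the 2000-byte batch size are hypothetical, and 4096 is assumed as the index interval. It compares an index that gets at most one entry per append with one that checks the interval after every batch, and counts how many batches a lookup would have to scan in each case.

```java
import java.util.TreeMap;

// Toy model only; names and sizes here are illustrative assumptions, not Kafka APIs.
class SparseIndexSketch {

    // Models one append holding `batches` equally sized batches.
    // Returns a sparse index mapping a batch's base offset to its position in the append.
    static TreeMap<Long, Integer> buildIndex(int batches, int recordsPerBatch,
                                             int batchBytes, int intervalBytes,
                                             boolean checkPerBatch) {
        TreeMap<Long, Integer> index = new TreeMap<>();
        index.put(0L, 0); // simplification: one entry at the start of the append
        if (!checkPerBatch) {
            return index; // at most one entry, no matter how large the append is
        }
        int bytesSinceLastEntry = 0;
        for (int i = 0; i < batches; i++) {
            // Add an entry once more than intervalBytes have accumulated.
            if (bytesSinceLastEntry > intervalBytes) {
                index.put((long) i * recordsPerBatch, i);
                bytesSinceLastEntry = 0;
            }
            bytesSinceLastEntry += batchBytes;
        }
        return index;
    }

    // Counts the batches visited when scanning forward from the closest
    // index entry at or below targetOffset to the batch that contains it.
    static int batchesScanned(TreeMap<Long, Integer> index, int recordsPerBatch,
                              long targetOffset) {
        int startBatch = index.floorEntry(targetOffset).getValue();
        int targetBatch = (int) (targetOffset / recordsPerBatch);
        return targetBatch - startBatch + 1;
    }

    public static void main(String[] args) {
        int batches = 100000;
        int recordsPerBatch = 100;
        int batchBytes = 2000;    // assumed batch size
        int intervalBytes = 4096; // assumed index interval
        long target = 9999999L;   // same offset as the read in the test

        TreeMap<Long, Integer> perAppend =
            buildIndex(batches, recordsPerBatch, batchBytes, intervalBytes, false);
        TreeMap<Long, Integer> perBatch =
            buildIndex(batches, recordsPerBatch, batchBytes, intervalBytes, true);

        System.out.println("one entry per append: entries=" + perAppend.size()
            + ", batches scanned=" + batchesScanned(perAppend, recordsPerBatch, target));
        System.out.println("interval checked per batch: entries=" + perBatch.size()
            + ", batches scanned=" + batchesScanned(perBatch, recordsPerBatch, target));
    }
}
```

In this model the single-entry index forces a scan over every batch in the append, while the per-batch check keeps the scan within a few batches. The real cost in `LogSegment#read` depends on details the model leaves out.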
> Index may not always observe log.index.interval.bytes
> -----------------------------------------------------
>
> Key: KAFKA-806
> URL: https://issues.apache.org/jira/browse/KAFKA-806
> Project: Kafka
> Issue Type: Improvement
> Components: log
> Reporter: Jun Rao
> Assignee: Chun-Hao Tang
> Priority: Major
> Labels: newbie++
>
> Currently, each log.append() will add at most 1 index entry, even when the
> appended data is larger than log.index.interval.bytes. One potential issue is
> that if a follower restarts after being down for a long time, it may fetch
> data much bigger than log.index.interval.bytes at a time. This means that
> fewer index entries are created, which can increase the fetch time from the
> consumers.