shashankhs11 commented on code in PR #20285:
URL: https://github.com/apache/kafka/pull/20285#discussion_r2409140007
##########
clients/src/main/java/org/apache/kafka/clients/producer/internals/ProducerBatch.java:
##########
@@ -237,6 +238,20 @@ public boolean completeExceptionally(
return done(ProduceResponse.INVALID_OFFSET, RecordBatch.NO_TIMESTAMP,
topLevelException, recordExceptions);
}
+ /**
+ * Get all record futures for this batch.
+ * This is used by flush() to wait on individual records rather than the batch-level future.
+ * When batches are split, individual futures are chained to the new batches,
+ * ensuring flush() waits for all split batches to complete.
+ *
+ * @return List of FutureRecordMetadata for all records in this batch
+ */
+ public List<FutureRecordMetadata> recordFutures() {
+ return thunks.stream()
+ .map(thunk -> thunk.future)
Review Comment:
> Could you do some perf test to compare the flush() time with and w/o this PR with say a few thousands pending records?
Added a test in ae54bc2d692f7663242498ef94f014c1837cd798.
I ran a performance test of flush() with 5000 pending records:
- Without PR (on avg): `0.163 ms`
- With PR (on avg): `1.436 ms`
- Increase: `+1.273 ms`
With this PR, flush() is significantly slower in this test, roughly 8.8x on average :smiling_face_with_tear:
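
For illustration, here is a minimal sketch of the two waiting strategies being compared: one wait on a batch-level future versus one wait per record future. It is not the test from the commit above and not Kafka code. `FlushWaitSketch`, `completedRecordFutures`, `awaitBatch`, and `awaitRecords` are hypothetical names, and `CompletableFuture` stands in for `FutureRecordMetadata`. The timing uses plain `System.nanoTime()` on futures that are already complete, so the printed numbers are noisy and are not expected to match the figures reported here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

// Hypothetical stand-in for illustration; not Kafka's ProducerBatch or FutureRecordMetadata.
class FlushWaitSketch {

    // Builds n already-completed futures, standing in for per-record futures.
    static List<CompletableFuture<Long>> completedRecordFutures(int n) {
        List<CompletableFuture<Long>> futures = new ArrayList<>(n);
        for (int i = 0; i < n; i++) {
            futures.add(CompletableFuture.completedFuture((long) i));
        }
        return futures;
    }

    // Batch-level wait: a single join() regardless of how many records the batch holds.
    static int awaitBatch(CompletableFuture<Void> batchFuture) {
        batchFuture.join();
        return 1;
    }

    // Per-record wait: one join() per record, so the cost grows with the record count.
    static int awaitRecords(List<CompletableFuture<Long>> recordFutures) {
        int waited = 0;
        for (CompletableFuture<Long> future : recordFutures) {
            future.join();
            waited++;
        }
        return waited;
    }

    public static void main(String[] args) {
        int pending = 5000; // assumed to mirror the pending-record count mentioned above
        List<CompletableFuture<Long>> records = completedRecordFutures(pending);
        CompletableFuture<Void> batch = CompletableFuture.completedFuture(null);

        long t0 = System.nanoTime();
        int batchWaits = awaitBatch(batch);
        long t1 = System.nanoTime();
        int recordWaits = awaitRecords(records);
        long t2 = System.nanoTime();

        System.out.printf("batch-level: %d wait(s), %d ns%n", batchWaits, t1 - t0);
        System.out.printf("per-record:  %d wait(s), %d ns%n", recordWaits, t2 - t1);
    }
}
```

The sketch only shows why the per-record path does O(n) work where the batch-level path does O(1). A harness such as JMH would be needed for numbers worth comparing.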