danny0405 commented on code in PR #13566:
URL: https://github.com/apache/hudi/pull/13566#discussion_r2211971290
##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/KeyBasedFileGroupRecordBuffer.java:
##########
@@ -133,7 +133,13 @@ public boolean containsLogRecord(String recordKey) {
protected boolean hasNextBaseRecord(T baseRecord) throws IOException {
String recordKey = readerContext.getRecordKey(baseRecord, readerSchema);
- BufferedRecord<T> logRecordInfo = records.remove(recordKey);
+ BufferedRecord<T> logRecordInfo;
+ if (readerContext.getReuseBuffer()) {
+ logRecordInfo = records.get(recordKey);
+ logRecordInfo.setMerged();
Review Comment:
The map could be a spillable map and the value is actually serialized from
disk, just modifying the flag does not work as expected.
My suggestion is we maintain another spillable map for these merged records,
and put them back into the `records` map when initializing for reuse. So that
we can also get rid of these maintainance of these `BufferedRecord.setMerged`
flags and make the code more clear.
Or a special spillalbe map wrapper when reuse is enabled, it holds 2 maps:
one for merged records, one for left logs. When invoking `#Map.remove`,
triggers the backup of the merged records.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]