linliu-code commented on code in PR #13115:
URL: https://github.com/apache/hudi/pull/13115#discussion_r2035925762
##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/FileGroupRecordBuffer.java:
##########
@@ -275,43 +274,47 @@ protected Option<Pair<Option<T>, Map<String, Object>>> doProcessNextDataRecord(T
} else {
switch (recordMergeMode) {
case COMMIT_TIME_ORDERING:
- return Option.empty();
+ return Option.of(Pair.of(Option.ofNullable(record), metadata));
case EVENT_TIME_ORDERING:
- Comparable existingOrderingValue = readerContext.getOrderingValue(
- existingRecordMetadataPair.getLeft(), existingRecordMetadataPair.getRight(),
- readerSchema, orderingFieldName);
- if (isDeleteRecordWithNaturalOrder(existingRecordMetadataPair.getLeft(), existingOrderingValue)) {
- return Option.empty();
- }
- Comparable incomingOrderingValue = readerContext.getOrderingValue(
- Option.of(record), metadata, readerSchema, orderingFieldName);
- if (incomingOrderingValue.compareTo(existingOrderingValue) > 0) {
+ if (shouldKeepNewerRecord(existingRecordMetadataPair.getLeft(), existingRecordMetadataPair.getRight(), Option.ofNullable(record), metadata)) {
return Option.of(Pair.of(Option.of(record), metadata));
}
return Option.empty();
case CUSTOM:
default:
// Merge and store the combined record
- // Note that the incoming `record` is from an older commit, so it should be put as
- // the `older` in the merge API
if (payloadClass.isPresent()) {
+ if (existingRecordMetadataPair.getLeft().isEmpty()
Review Comment:
Please note that this implies a delete is always commit-time based, even in
custom merge mode.
This is consistent with commit-time merge mode.
I wonder whether we should also do this for event-time merge mode, i.e.,
regardless of the delete's ordering value, always treat it as a commit-time
based delete. CC: @danny0405, @nsivabalan, @yihua
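
To make the question concrete, here is a minimal sketch of the two possible behaviors for event-time merge mode. It is a toy model, not the Hudi implementation: `Version`, `mergeByOrderingValue`, and `mergeWithCommitTimeDeletes` are hypothetical names, and the tie-breaking rule (ties go to the later commit) is an assumption.

```java
import java.util.Optional;

final class DeleteOrderingSketch {

    /** Hypothetical stand-in for one version of a record key; not a Hudi type. */
    static final class Version {
        final String value;       // payload; null for a delete
        final long orderingValue; // value of the event-time ordering field
        final boolean isDelete;

        private Version(String value, long orderingValue, boolean isDelete) {
            this.value = value;
            this.orderingValue = orderingValue;
            this.isDelete = isDelete;
        }

        static Version data(String value, long orderingValue) {
            return new Version(value, orderingValue, false);
        }

        static Version delete(long orderingValue) {
            return new Version(null, orderingValue, true);
        }
    }

    /** Option A: a delete competes on its ordering value like any other record. */
    static Optional<String> mergeByOrderingValue(Version earlierCommit, Version laterCommit) {
        // Assumption: on equal ordering values the later commit wins.
        Version winner = laterCommit.orderingValue >= earlierCommit.orderingValue
            ? laterCommit
            : earlierCommit;
        return winner.isDelete ? Optional.empty() : Optional.of(winner.value);
    }

    /** Option B: a delete ignores ordering values and is resolved by commit order. */
    static Optional<String> mergeWithCommitTimeDeletes(Version earlierCommit, Version laterCommit) {
        if (earlierCommit.isDelete || laterCommit.isDelete) {
            // Whichever side is the delete, the later commit decides the outcome.
            return laterCommit.isDelete ? Optional.empty() : Optional.of(laterCommit.value);
        }
        // Two data records still merge on the ordering value.
        return mergeByOrderingValue(earlierCommit, laterCommit);
    }

    public static void main(String[] args) {
        Version earlier = Version.data("v1", 10);
        Version lateDelete = Version.delete(5); // later commit, lower ordering value

        System.out.println(mergeByOrderingValue(earlier, lateDelete));       // Optional[v1]
        System.out.println(mergeWithCommitTimeDeletes(earlier, lateDelete)); // Optional.empty
    }
}
```

The two options only diverge when a delete carries a lower ordering value than the record it competes with: option A keeps the record, option B drops it.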