TheR1sing3un commented on code in PR #12971:
URL: https://github.com/apache/hudi/pull/12971#discussion_r1993114417
##########
hudi-common/src/main/java/org/apache/hudi/common/table/read/HoodieBaseFileGroupRecordBuffer.java:
##########
@@ -322,11 +322,13 @@ protected Option<DeleteRecord>
doProcessNextDeletedRecord(DeleteRecord deleteRec
return Option.empty();
}
Comparable deleteOrderingVal = deleteRecord.getOrderingValue();
- // Checks the ordering value does not equal to 0
- // because we use 0 as the default value which means natural order
- boolean chooseExisting = !deleteOrderingVal.equals(0)
- && ReflectionUtils.isSameClass(existingOrderingVal,
deleteOrderingVal)
- && existingOrderingVal.compareTo(deleteOrderingVal) > 0;
+
+ // Because the delete record from older log block, so we should deal
it with three cases:
+ // 1. delete record with invalid ordering value: keep existing
record because it is newer, deletion can be ignored.
+ // 2. delete record with valid ordering value and higher than
existing record: keep delete record because it is newer.
+ // 3. delete record with valid ordering value and lower than
existing record: keep existing record because it is newer, deletion can be
ignored.
+ boolean chooseExisting = deleteOrderingVal.equals(0)
Review Comment:
In addition, there is another question that makes me confused. The concept
of natural order is hidden instead of provided. At present, it is assumed that
the read order of log block is natural order or `commit_time`. Is it necessary
to put `commit_time` directly into HoodieRecord? It looks like we can get this
field out of the data block header.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]