codope commented on code in PR #9058:
URL: https://github.com/apache/hudi/pull/9058#discussion_r1246592430
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java:
##########
@@ -209,9 +211,10 @@ public class HoodieMetadataPayload implements
HoodieRecordPayload<HoodieMetadata
private HoodieMetadataBloomFilter bloomFilterMetadata = null;
private HoodieMetadataColumnStats columnStatMetadata = null;
private HoodieRecordIndexInfo recordIndexMetadata;
+ private boolean isDeletedRecord = false;
- public HoodieMetadataPayload(GenericRecord record, Comparable<?>
orderingVal) {
- this(Option.of(record));
+ public HoodieMetadataPayload(@Nullable GenericRecord record, Comparable<?>
orderingVal) {
+ this(Option.ofNullable(record));
Review Comment:
Where is this constructor used?
##########
hudi-common/src/main/java/org/apache/hudi/metadata/HoodieMetadataPayload.java:
##########
@@ -283,6 +285,8 @@ public HoodieMetadataPayload(Option<GenericRecord>
recordOpt) {
Integer.parseInt(recordIndexRecord.get(RECORD_INDEX_FIELD_FILE_INDEX).toString()),
Long.parseLong(recordIndexRecord.get(RECORD_INDEX_FIELD_INSTANT_TIME).toString()));
}
+ } else {
+ this.isDeletedRecord = true;
Review Comment:
I would favor `isDeleted` field in `HoodieRecordIndexInfo` in the schema.
1. It keeps the schema consistent wrt deletes for different MDT index types.
Let's say some index types have `isDeleted` and some don't, then it's an added
mental burden for developers and also not easy to maintain as we add more
indexes.
2. It gives enough flexibility to have separate delete handling logic for
different index types.
3. Let's consider the semantics of the if-else in the
`HoodieMetadataPayload` constructor. It is based on different index types. By
setting `this.isDeletedRecord = true` in the last else-block we're saying that
for all index types other than the ones above, consider the record to be
deleted. It does not make much sense from the pov of adding more index types in
the future.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]