linliu-code commented on code in PR #9593:
URL: https://github.com/apache/hudi/pull/9593#discussion_r1340565776
##########
hudi-client/hudi-client-common/src/main/java/org/apache/hudi/io/HoodieMergeHandle.java:
##########
@@ -270,14 +272,46 @@ protected boolean writeUpdateRecord(HoodieRecord<T>
newRecord, HoodieRecord<T> o
if (combineRecordOpt.isPresent()) {
if (oldRecord.getData() != combineRecordOpt.get().getData()) {
// the incoming record is chosen
- isDelete = HoodieOperation.isDelete(newRecord.getOperation());
+ isDelete = is_delete_record(newRecord, writerSchema,
config.getProps());
} else {
// the incoming record is dropped
return false;
}
updatedRecordsWritten++;
}
- return writeRecord(newRecord, combineRecordOpt, writerSchema,
config.getPayloadConfig().getProps(), isDelete);
+
+ // Do delete since the newRecord is a delete record.
+ if (isDelete) {
+ recordsDeleted++;
+ newRecord.unseal();
+ newRecord.clearNewLocation();
+ newRecord.seal();
+ newRecord.deflate();
+ return true;
+ }
+
+ // Inject custom insert/abort logic.
+ Option<Pair<HoodieRecord, Schema>> processedRecord = recordMerger.merge(
+ Option.empty(), writerSchema, combineRecordOpt, writerSchema,
config.getProps());
+ if (!processedRecord.isPresent()
+ || !is_valid_record(processedRecord.get().getLeft(), writerSchema,
config.getProps())) {
+ return false;
+ }
+
+ // Write the record finally.
+ // TODO: remove delete logic from writeRecord function.
+ return writeRecord(newRecord, Option.of(processedRecord.get().getLeft()),
writerSchema, config.getPayloadConfig().getProps(), isDelete);
+ }
+
+ protected boolean is_delete_record(HoodieRecord record, Schema schema,
TypedProperties props) throws IOException {
+ return record.isDelete(schema, props)
+ || record instanceof HoodieEmptyRecord
+ || (record.getData() != null && record.getData() instanceof
EmptyHoodieRecordPayload)
+ || HoodieOperation.isDelete(record.getOperation());
Review Comment:
Based on my understanding, Hudi uses both empty record and delete record
interchangeably to mean delete. So I combine here. I will modify this PR based
on our discussion so this logic should be gone. But we should find a way to
standardize these logics; otherwise, the scattering logic is really confusing.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]