cdmikechen commented on a change in pull request #1073: [HUDI-377] Adding
Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r361247665
##########
File path:
hudi-spark/src/main/java/org/apache/hudi/OverwriteWithLatestAvroPayload.java
##########
@@ -61,8 +60,15 @@ public OverwriteWithLatestAvroPayload
preCombine(OverwriteWithLatestAvroPayload
@Override
public Option<IndexedRecord> combineAndGetUpdateValue(IndexedRecord
currentValue, Schema schema) throws IOException {
+
Review comment:
If we change the code in `combineAndGetUpdateValue()` and the row doesn't have a
`_hoodie_delete_marker` column, we need to call `Option.of(genericRecord)` twice.
Should we put the check in `getInsertValue()` instead? Like this:
```java
@Override
public Option<IndexedRecord> combineAndGetUpdateValue(IndexedRecord currentValue, Schema schema)
    throws IOException {
  // combining strategy here trivially ignores currentValue on disk and writes this record
  return getInsertValue(schema);
}

@Override
public Option<IndexedRecord> getInsertValue(Schema schema) throws IOException {
  GenericRecord baseRecord = HoodieAvroUtils.bytesToAvro(recordBytes, schema);
  Object deleteMarker = baseRecord.get("_hoodie_delete_marker");
  if (deleteMarker instanceof Boolean && (boolean) deleteMarker) {
    return Option.empty();
  }
  return Option.of(baseRecord);
}
```
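
To make the intended behavior easier to check, here is a minimal standalone sketch of the same delegation pattern. It is not the Hudi implementation: it assumes a plain `Map<String, Object>` as a stand-in for the Avro `GenericRecord` and `java.util.Optional` in place of Hudi's `Option`, and the class name `DeleteMarkerSketch` is hypothetical. It only illustrates that a missing, null, or `false` marker keeps the record, while a `true` marker yields an empty result.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the payload class; a Map plays the role of the Avro record.
class DeleteMarkerSketch {
    static final String DELETE_MARKER = "_hoodie_delete_marker";

    private final Map<String, Object> record;

    DeleteMarkerSketch(Map<String, Object> record) {
        this.record = record;
    }

    // Mirrors getInsertValue(): an empty result means "treat this record as a delete".
    Optional<Map<String, Object>> getInsertValue() {
        Object deleteMarker = record.get(DELETE_MARKER);
        // instanceof is false for null, so a missing or null marker keeps the record.
        if (deleteMarker instanceof Boolean && (Boolean) deleteMarker) {
            return Optional.empty();
        }
        return Optional.of(record);
    }

    // Mirrors combineAndGetUpdateValue(): ignores the stored value and delegates,
    // so the delete check and the Optional wrapping happen in one place only.
    Optional<Map<String, Object>> combineAndGetUpdateValue(Map<String, Object> currentValue) {
        return getInsertValue();
    }

    public static void main(String[] args) {
        Map<String, Object> onDisk = new HashMap<>();
        onDisk.put("id", 1);

        Map<String, Object> noMarker = new HashMap<>();
        noMarker.put("id", 1);

        Map<String, Object> deleted = new HashMap<>();
        deleted.put("id", 1);
        deleted.put(DELETE_MARKER, true);

        // Record without the marker column is kept.
        System.out.println(new DeleteMarkerSketch(noMarker).combineAndGetUpdateValue(onDisk).isPresent());
        // Record with the marker set to true is dropped.
        System.out.println(new DeleteMarkerSketch(deleted).combineAndGetUpdateValue(onDisk).isPresent());
    }
}
```

With this shape, both the insert path and the update path share one delete check, which is the point of moving it into `getInsertValue()`.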
----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]
With regards,
Apache Git Services