cdmikechen commented on a change in pull request #1073: [HUDI-377] Adding
Delete() support to DeltaStreamer
URL: https://github.com/apache/incubator-hudi/pull/1073#discussion_r361347283
##########
File path:
hudi-spark/src/main/java/org/apache/hudi/OverwriteWithLatestAvroPayload.java
##########
@@ -61,8 +60,15 @@ public OverwriteWithLatestAvroPayload preCombine(OverwriteWithLatestAvroPayload
   @Override
   public Option<IndexedRecord> combineAndGetUpdateValue(IndexedRecord currentValue, Schema schema) throws IOException {
+
Review comment:
@vinothchandar
> Doing this in getInsertValue() means even inserts with the flag set will
> be deleted.. Not sure if this is intended behavior.. We only want to delete if
> updating and marker set?
Deleting only on update works in a Kappa architecture. In a Lambda-style
architecture, however, the data sometimes has to be rebuilt, and the rebuild
may load the entire change log through a bulk insert. In that case, records
carrying the delete marker would arrive as inserts.
Of course, this is only an assumption on my part, and our current test cases
may not cover this situation. If I am overthinking this and it cannot occur in
practice, please ignore my review.
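To make the rebuild case concrete, here is a minimal sketch. It is not the
Hudi API: the class, the method names, the `is_deleted` field, and the
`checkMarker` switch are all hypothetical, and records are plain maps rather
than Avro records. It only shows what happens to a delete-marked record that
arrives on the insert path, depending on whether that path checks the marker.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Simplified stand-in for a record payload; this is NOT the Hudi API.
final class DeleteMarkerSketch {
    // Hypothetical marker field name, chosen only for this illustration.
    static final String DELETE_MARKER = "is_deleted";

    static boolean isMarkedDeleted(Map<String, Object> record) {
        return Boolean.TRUE.equals(record.get(DELETE_MARKER));
    }

    // Insert path: no earlier version of the record exists in the table.
    // With checkMarker == true, a flagged record is dropped here as well.
    static Optional<Map<String, Object>> insertValue(
            Map<String, Object> incoming, boolean checkMarker) {
        if (checkMarker && isMarkedDeleted(incoming)) {
            return Optional.empty();
        }
        return Optional.of(incoming);
    }

    // Update path: the latest record wins, and a flagged record removes the
    // stored one. The current value is not consulted in this sketch.
    static Optional<Map<String, Object>> updateValue(
            Map<String, Object> current, Map<String, Object> incoming) {
        if (isMarkedDeleted(incoming)) {
            return Optional.empty();
        }
        return Optional.of(incoming);
    }

    // Helper that builds a toy record with a key and a delete marker.
    static Map<String, Object> record(String key, boolean deleted) {
        Map<String, Object> r = new HashMap<>();
        r.put("key", key);
        r.put(DELETE_MARKER, deleted);
        return r;
    }

    public static void main(String[] args) {
        // During a rebuild by bulk insert, a delete-marked record from the
        // change log reaches the insert path instead of the update path.
        Map<String, Object> tombstone = record("k1", true);
        System.out.println("marker checked on insert, record kept: "
                + insertValue(tombstone, true).isPresent());
        System.out.println("marker ignored on insert, record kept: "
                + insertValue(tombstone, false).isPresent());
    }
}
```

Under these assumptions, if the marker is honoured only on update, a rebuild
would write the delete-marked record as a live row. That is the case I am
worried about.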