bvaradar commented on issue #1979:
URL: https://github.com/apache/hudi/issues/1979#issuecomment-679522323


   @hughfdjackson : In general, incremental reads cannot discard duplicates 
on MOR table types, because the merging of records is deferred to 
compaction.
   
   I was thinking about alternate ways to achieve your use-case for a COW 
table by using an application-level boolean flag. Let me know if this makes 
sense:
   
   1. Introduce an additional boolean column "changed". Its default value is 
false.
   2. Plug in your own implementation of HoodieRecordPayload.
   3(a) In HoodieRecordPayload.getInsertValue(), return an Avro record with 
changed = true. This function is called the first time a new record is 
inserted.
   3(b) In HoodieRecordPayload.combineAndGetUpdateValue(), if you determine 
that there is no material change, set changed = false; otherwise set it to 
true.
   
   In your incremental query, you could then add the filter changed = true 
to exclude records without material changes.
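
   To make the idea concrete, here is a minimal plain-Python model of the 
flag logic. It is only a sketch and does not use the Hudi API: the names 
on_insert, on_update, apply_commit and TRACKED_FIELDS are hypothetical, and 
the fields "name" and "price" are made-up examples. In a real payload, the 
on_insert logic would live in getInsertValue() and the on_update logic in 
combineAndGetUpdateValue().

```python
# Hypothetical business fields whose changes count as "material".
TRACKED_FIELDS = ("name", "price")


def on_insert(incoming: dict) -> dict:
    """Models step 3(a): a record seen for the first time is always changed."""
    return {**incoming, "changed": True}


def on_update(stored: dict, incoming: dict) -> dict:
    """Models step 3(b): flag the record only if a tracked field differs."""
    material = any(stored.get(f) != incoming.get(f) for f in TRACKED_FIELDS)
    return {**incoming, "changed": material}


def apply_commit(table: dict, batch: dict) -> list:
    """Upsert a batch keyed by record key and return the rows it wrote.

    The returned rows stand in for what an incremental read of this
    commit would see (an assumption of this toy model).
    """
    written = []
    for key, incoming in batch.items():
        stored = table.get(key)
        row = on_insert(incoming) if stored is None else on_update(stored, incoming)
        table[key] = row
        written.append(row)
    return written


table = {}
first = apply_commit(table, {
    "a": {"id": "a", "name": "x", "price": 1, "ts": 1},
    "b": {"id": "b", "name": "y", "price": 2, "ts": 1},
})
second = apply_commit(table, {
    "a": {"id": "a", "name": "x", "price": 1, "ts": 2},  # only ts differs
    "b": {"id": "b", "name": "y", "price": 3, "ts": 2},  # price differs
})

# The "changed = true" filter applied to the rows of the second commit.
materially_changed = [row for row in second if row["changed"]]
```

   Both rows of the first batch are flagged because they are inserts. In 
the second batch, record "a" is rewritten with a new ts only, so the filter 
drops it, while record "b" is kept because its price changed.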



