am-cpp commented on issue #2992:
URL: https://github.com/apache/hudi/issues/2992#issuecomment-847745284


   The issue seems to happen only when the **INSERT_DROP_DUPS_OPT_KEY**
flag is set to **true**. It looks like this config is used for both:
   
   1. Pre-combining: 
[link](https://github.com/apache/hudi/blob/master/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala#L182)
   2. Deleting incoming records already present in the table:
[link](https://github.com/apache/hudi/blob/master/hudi-spark-datasource/hudi-spark/src/main/scala/org/apache/hudi/HoodieSparkSqlWriter.scala#L158)
   
   As far as the behavior of the insert overwrite API is concerned, it should
always delete the partition and write the incoming records; dropping duplicates
should only pre-combine the input records.
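   For reference, a minimal sketch of a write that hits this path. The option
keys are the standard Hudi Spark datasource configs involved here; the table
name, path, and field names are made up for illustration:

   ```scala
   import org.apache.spark.sql.SaveMode

   // Hypothetical table/path/field names, for illustration only.
   df.write.format("hudi")
     .option("hoodie.datasource.write.operation", "insert_overwrite")
     // With this set to true, incoming records that already exist in the
     // table are dropped, which interferes with insert overwrite's
     // "replace the whole partition" semantics.
     .option("hoodie.datasource.write.insert.drop.duplicates", "true")
     .option("hoodie.datasource.write.recordkey.field", "uuid")
     .option("hoodie.datasource.write.precombine.field", "ts")
     .option("hoodie.table.name", "my_table")
     .mode(SaveMode.Append)
     .save("/tmp/hudi/my_table")
   ```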
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]

