nsivabalan edited a comment on issue #4318:
URL: https://github.com/apache/hudi/issues/4318#issuecomment-997551175


   I see you are using "INSERT" as the operation type. If your incoming records 
contain duplicates, they can show up as duplicates in Hudi as well. Hudi de-dups 
explicitly only with "UPSERT". For "INSERT", you need to set this config 
https://hudi.apache.org/docs/configurations/#hoodiecombinebeforeinsert to true 
if you want Hudi to de-dup the records before ingesting them. 
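
   As a rough sketch of what that could look like with the Spark datasource 
writer: the linked config corresponds to the key `hoodie.combine.before.insert`. 
The table name, record key field ("id"), and precombine field ("ts") below are 
hypothetical placeholders, not values from this issue, so substitute your own. 

```python
# Write options for an INSERT that de-dups incoming records first.
# Only the option keys are Hudi configs; every value except "insert"
# and "true" is a hypothetical placeholder.
hudi_options = {
    "hoodie.table.name": "my_table",                   # placeholder
    "hoodie.datasource.write.operation": "insert",
    "hoodie.datasource.write.recordkey.field": "id",   # placeholder
    "hoodie.datasource.write.precombine.field": "ts",  # placeholder
    # Ask Hudi to combine records with the same key before inserting.
    "hoodie.combine.before.insert": "true",
}

# Assuming an existing Spark DataFrame `df` and a target path `base_path`:
# df.write.format("hudi").options(**hudi_options).mode("append").save(base_path)
```

   The write call is left commented out because it needs a Spark session with 
the Hudi bundle on the classpath. 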
   
   Can you try setting this and let us know how it goes? 
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

