istreeter commented on issue #9831:
URL: https://github.com/apache/hudi/issues/9831#issuecomment-1754600158

   Thank you, the FAQ is helpful for explaining ways to work with the problem.
   
   I think the summary of this conversation is that duplicates can still happen 
even when `hoodie.datasource.write.operation=upsert` is set.  I therefore 
strongly think [this bit of 
documentation](https://hudi.apache.org/docs/concurrency_control/#multi-writer-guarantees)
 needs changing.
   
   Currently it says:
   
   > UPSERT Guarantee: The target table will NEVER show duplicates.
   
   But that is misleading, because an UPSERT can include both INSERTS and 
UPDATES.  So it should say either:
   
   > UPSERT Guarantee: The target table **MIGHT** show duplicates.
   
   or
   
   > **UPDATE** Guarantee: The target table will NEVER show duplicates.
   
   Sorry to keep pushing this point, but I think it is important: if the 
documentation is going to provide guarantees, then the guarantees should be 
worded accurately.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.