hililiwei commented on issue #5339:
URL: https://github.com/apache/iceberg/issues/5339#issuecomment-1193003590

   Similarly, in Flink we need a way to avoid double commits when writing 
data.
   We could add a default behavior that rejects a file that has already been 
committed. Besides comparing file paths, we should also compare file 
contents, for example by verifying an MD5 checksum, to confirm that the two 
files really are identical.
   If we really do want to add duplicate files, we can allow it through an 
option such as 'force', as @Spince suggested.
   Alternatively, we could reverse the logic and allow duplicates by default. 
This has the advantage of preserving compatibility with our current behavior.
   In conclusion, I believe such a mechanism would be useful.
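   To make the idea concrete, here is a rough sketch of the check, not 
Iceberg or Flink code. `CommitLog`, `DuplicateFileError`, and the `force` 
parameter are hypothetical names used only for illustration, and the file 
content is passed in as bytes instead of being read from storage. It assumes a 
file counts as a duplicate only when both its path and its MD5 checksum match 
an earlier commit; how to treat the same path with different content is left 
open here.

```python
import hashlib


class DuplicateFileError(Exception):
    """Raised when a file with a known path and checksum is committed again."""


class CommitLog:
    """Hypothetical in-memory stand-in for a table's committed data files."""

    def __init__(self):
        self.entries = []   # every (path, md5) pair accepted so far, in order
        self._seen = set()  # the same pairs, kept for fast duplicate lookup

    def append(self, path, content, force=False):
        """Record a file unless the same path and content were committed before."""
        key = (path, hashlib.md5(content).hexdigest())
        if key in self._seen and not force:
            raise DuplicateFileError(
                f"{path} was already committed with the same content"
            )
        self._seen.add(key)
        self.entries.append(key)


log = CommitLog()
log.append("data/file-a.parquet", b"rows 1-100")

try:
    # A retry of the same commit: same path, same content.
    log.append("data/file-a.parquet", b"rows 1-100")
    rejected = False
except DuplicateFileError:
    rejected = True

# Explicit opt-in: the duplicate is accepted when force is set.
log.append("data/file-a.parquet", b"rows 1-100", force=True)
```

   Flipping the default of `force` to `True` would give the reversed, 
backward-compatible variant in which duplicates are allowed unless the caller 
asks for the check.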
   
   Thanks.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

