pvary commented on issue #14425: URL: https://github.com/apache/iceberg/issues/14425#issuecomment-3451322554
> This check leads to data duplication if the previous commit is still being processed in the REST catalog during Flink recovery, but the client has already aborted its POST request and sent a new one for the same data while recovering the Flink job.

The new commit should contain the previous version as its base snapshot. The REST catalog should check whether the table's current snapshot is still the expected base. If the table has changed in the meantime, it should fail the second commit. The Flink job should then fail, and afterwards it should find the commit that did land in the table history, so it should skip committing the change again.
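
To make the expected sequence concrete, here is a minimal in-memory sketch of the base-snapshot check and the skip on recovery. It is not Iceberg or Flink code: `ToyCatalog`, `Snapshot`, `CommitConflict`, `commit_if_needed`, and the `checkpoint_id` field are all hypothetical names made up for illustration, and the real committer and REST catalog may track this differently.

```python
from dataclasses import dataclass, field
from typing import List, Optional


class CommitConflict(Exception):
    """Raised when the table has moved past the base snapshot the client expected."""


@dataclass
class Snapshot:
    snapshot_id: int
    parent_id: Optional[int]
    checkpoint_id: int  # hypothetical marker: which Flink checkpoint produced this snapshot


@dataclass
class ToyCatalog:
    """Stand-in for a catalog; keeps the table history as a simple list."""
    history: List[Snapshot] = field(default_factory=list)

    def current_snapshot_id(self) -> Optional[int]:
        return self.history[-1].snapshot_id if self.history else None

    def commit(self, expected_base: Optional[int], checkpoint_id: int) -> Snapshot:
        # Compare-and-swap: accept the commit only if the table is still at
        # the snapshot the client built its changes on.
        if self.current_snapshot_id() != expected_base:
            raise CommitConflict(
                f"expected base {expected_base}, table is at {self.current_snapshot_id()}"
            )
        snapshot = Snapshot(len(self.history) + 1, expected_base, checkpoint_id)
        self.history.append(snapshot)
        return snapshot


def max_committed_checkpoint(catalog: ToyCatalog) -> int:
    """Scan the table history for the highest checkpoint already committed."""
    return max((s.checkpoint_id for s in catalog.history), default=-1)


def commit_if_needed(catalog: ToyCatalog, checkpoint_id: int) -> bool:
    """Recovery path: skip the commit if the history already contains it.

    Returns True if a new snapshot was written, False if it was skipped.
    """
    if checkpoint_id <= max_committed_checkpoint(catalog):
        return False
    catalog.commit(catalog.current_snapshot_id(), checkpoint_id)
    return True


if __name__ == "__main__":
    catalog = ToyCatalog()
    base = catalog.current_snapshot_id()

    # The first POST was aborted by the client but still lands in the catalog.
    catalog.commit(base, checkpoint_id=7)

    # The retry carries the same (now stale) base, so the catalog rejects it.
    try:
        catalog.commit(base, checkpoint_id=7)
    except CommitConflict as exc:
        print("second commit rejected:", exc)

    # After the job fails and recovers, it sees checkpoint 7 in the history.
    print("committed again:", commit_if_needed(catalog, checkpoint_id=7))
```

The point of the sketch is that the duplicate is prevented in two places: the stale base makes the retried commit fail, and the history scan on recovery keeps the job from resubmitting data that is already in the table.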
