rdblue opened a new pull request #3258:
URL: https://github.com/apache/iceberg/pull/3258


   This fixes CDC validation problems in Flink, #2482. The problem was that 
validation did not configure a starting snapshot ID by calling 
`validateFromSnapshot`. As a result, the entire table history was used to check 
that there were no deletes for the data files referenced by position deletes in 
the commit. One fix would be to set the validation snapshot correctly, rather 
than letting it default to the start of table history. This PR instead fixes 
the problem by removing the validation entirely, because none is required.
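
To illustrate the difference between the default and an explicit starting snapshot, here is a minimal toy model. It is not Iceberg code: the class `ValidationWindowSketch` and the method `snapshotsToScan` are hypothetical names chosen for this sketch, and the table history is reduced to a list of snapshot IDs ordered oldest first. A `null` starting snapshot stands in for the default described above.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model only: hypothetical names, not Iceberg classes.
class ValidationWindowSketch {
  // Returns the snapshot IDs a validation pass would scan, given the table
  // history (oldest first) and an optional starting snapshot ID.
  // A null start models the default: scan from the start of table history.
  static List<Long> snapshotsToScan(List<Long> history, Long startSnapshotId) {
    if (startSnapshotId == null) {
      return new ArrayList<>(history);
    }
    int start = history.indexOf(startSnapshotId);
    if (start < 0) {
      throw new IllegalArgumentException("Unknown starting snapshot: " + startSnapshotId);
    }
    // Only snapshots committed after the starting snapshot are scanned.
    return new ArrayList<>(history.subList(start + 1, history.size()));
  }
}
```

In this model, a history of four snapshots is scanned in full when no starting snapshot is given, while passing the third snapshot's ID limits the scan to the fourth.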
   
   The only position deletes created by the CDC writer are against the data 
files that are being added by the commit. Equality deletes are used to delete 
data in existing files, and those require no validation because they do not 
reference specific files and apply to all older data files (so retries do not 
affect correctness). Because position deletes will only reference data files 
being added to the table, there is no possibility that those files are 
concurrently deleted.
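
The reasoning above can be sketched as a simple set check. This is a toy model, not the writer's actual code: `PositionDeleteScopeSketch` and `needsExistenceCheck` are hypothetical names, and data files are represented only by their paths.

```java
import java.util.Set;

// Toy model only: hypothetical names, not Iceberg classes.
class PositionDeleteScopeSketch {
  // Returns true if some position delete references a data file that is NOT
  // added by the same commit. Only such references could point at a file that
  // another writer removed concurrently.
  static boolean needsExistenceCheck(
      Set<String> addedDataFiles, Set<String> referencedByPositionDeletes) {
    return !addedDataFiles.containsAll(referencedByPositionDeletes);
  }
}
```

When every referenced path is among the files added by the commit, as is the case for the CDC writer described here, the check returns false and there is nothing to validate.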
   
   Closes #2482.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]