rdblue commented on issue #2482:
URL: https://github.com/apache/iceberg/issues/2482#issuecomment-939049135


   There seems to be a lot of confusion around this issue. It was just 
referenced again in [this 
comment](https://github.com/apache/iceberg/pull/3213#discussion_r724779586).
   
   The problem is not the behavior of the validation itself, which is doing 
the right thing for how it is configured. I think the problem is the 
configuration: the validation is set up to check the entire table history, 
which is clearly not correct.
   
   The problematic validation is `validateDataFilesExist`. That only needs to 
be used when adding position deletes because equality deletes do not reference 
specific data files. Since position deletes for a CDC stream are only added 
against data files that are being added, I don't think that validation even 
needs to be configured. We can simply remove these two lines: 
https://github.com/apache/iceberg/blob/1cb04128661ea147c2eec4dd1d025698f9604993/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitter.java#L286-L287
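   
   To make the table-history point concrete, here is a toy model in plain Java. It is only a sketch and not Iceberg code: `Snapshot`, `missingReferencedFiles`, and the file name are hypothetical, and it assumes the check simply looks for referenced paths among the files removed by the snapshots it scans. It shows how the same check can pass or fail depending on where the scan starts.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

/** Toy model only (hypothetical names); this is not how Iceberg is implemented. */
final class ValidationWindowSketch {

  /** A committed snapshot, reduced to the data file paths it removed. */
  static final class Snapshot {
    final long id;
    final Set<String> removedDataFiles;

    Snapshot(long id, Set<String> removedDataFiles) {
      this.id = id;
      this.removedDataFiles = removedDataFiles;
    }
  }

  /**
   * Returns the referenced paths that were removed by snapshots committed
   * after startingSnapshotId. A null start means "scan the whole history".
   */
  static List<String> missingReferencedFiles(
      List<Snapshot> history, Long startingSnapshotId, Set<String> referenced) {
    List<String> missing = new ArrayList<>();
    boolean inWindow = (startingSnapshotId == null);
    for (Snapshot snapshot : history) {
      if (inWindow) {
        for (String path : snapshot.removedDataFiles) {
          if (referenced.contains(path)) {
            missing.add(path);
          }
        }
      } else if (snapshot.id == startingSnapshotId.longValue()) {
        // Everything after the starting snapshot is inside the window.
        inWindow = true;
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    // Snapshot 1 removed data-a.parquet; snapshot 2 removed nothing.
    List<Snapshot> history = Arrays.asList(
        new Snapshot(1L, Collections.singleton("data-a.parquet")),
        new Snapshot(2L, Collections.<String>emptySet()));
    Set<String> referenced = Collections.singleton("data-a.parquet");

    // Whole history: the old removal is reported as a conflict.
    System.out.println(missingReferencedFiles(history, null, referenced));
    // Starting after snapshot 1: the old removal is outside the window.
    System.out.println(missingReferencedFiles(history, 1L, referenced));
  }
}
```

   In this model, scanning from the start of history reports a removal that happened long before the current commit, while scanning from a later snapshot does not.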
   
   @openinx and @stevenzwu, what do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


