rdblue commented on issue #2482: URL: https://github.com/apache/iceberg/issues/2482#issuecomment-939049135
There seems to be a lot of confusion around this issue. It was just referenced again in [this comment](https://github.com/apache/iceberg/pull/3213#discussion_r724779586).

The problem is not the behavior of the validation. That's doing the right thing for how it is configured. I think that the problem is that the validation is configured to look over the entire table history, which is clearly not correct.

The problematic validation is `validateDataFilesExist`. That only needs to be used when adding position deletes because equality deletes do not reference specific data files. Since position deletes for a CDC stream are only added against data files that are being added, I don't think that validation even needs to be configured. We can simply remove these two lines: https://github.com/apache/iceberg/blob/1cb04128661ea147c2eec4dd1d025698f9604993/flink/src/main/java/org/apache/iceberg/flink/sink/IcebergFilesCommitter.java#L286-L287

@openinx and @stevenzwu, what do you think?
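
To make the scoping argument concrete, here is a toy sketch. It is not Iceberg code: `Snapshot`, `snapshotsToCheck`, and `removedReferencedFiles` are hypothetical names invented for illustration, and the file names are made up. It assumes a check that walks snapshots looking for removals of the data files that new position deletes reference, and it models "entire table history" as a `null` starting snapshot.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these types are Iceberg classes.
class ValidationScopeSketch {

    // Hypothetical stand-in for a snapshot: its id and the data files it removed.
    static final class Snapshot {
        final long id;
        final Set<String> removedDataFiles;

        Snapshot(long id, String... removedDataFiles) {
            this.id = id;
            this.removedDataFiles = new HashSet<>(Arrays.asList(removedDataFiles));
        }
    }

    // Snapshots a check has to inspect; history is ordered oldest to newest.
    // A null startingSnapshotId models "look over the entire table history".
    static List<Snapshot> snapshotsToCheck(List<Snapshot> history, Long startingSnapshotId) {
        List<Snapshot> toCheck = new ArrayList<>();
        for (Snapshot snapshot : history) {
            if (startingSnapshotId == null || snapshot.id > startingSnapshotId) {
                toCheck.add(snapshot);
            }
        }
        return toCheck;
    }

    // Referenced data files that were removed by any of the inspected snapshots.
    static Set<String> removedReferencedFiles(List<Snapshot> toCheck, Set<String> referencedFiles) {
        Set<String> removed = new HashSet<>();
        for (Snapshot snapshot : toCheck) {
            for (String file : snapshot.removedDataFiles) {
                if (referencedFiles.contains(file)) {
                    removed.add(file);
                }
            }
        }
        return removed;
    }

    public static void main(String[] args) {
        List<Snapshot> history = Arrays.asList(
            new Snapshot(1L),
            new Snapshot(2L, "data-a.parquet"),
            new Snapshot(3L));

        // CDC-style commit: the position deletes reference only the file this commit adds.
        Set<String> referenced = Collections.singleton("data-new.parquet");

        List<Snapshot> wholeHistory = snapshotsToCheck(history, null);
        List<Snapshot> scoped = snapshotsToCheck(history, 3L);

        System.out.println("entire history inspects " + wholeHistory.size() + " snapshots");
        System.out.println("scoped check inspects " + scoped.size() + " snapshots");
        System.out.println("removals found: "
            + removedReferencedFiles(wholeHistory, referenced).size());
    }
}
```

In this model the unscoped check inspects every snapshot and still cannot find a conflict, because no earlier snapshot can have removed a file that is only being added now. That is the sense in which the validation looks unnecessary for the CDC case.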
