rdblue opened a new pull request #1469: URL: https://github.com/apache/iceberg/pull/1469
This adds a new validation to the `RowDelta` operation that checks that a set of data files referenced by position deletes still exist. It also includes some small cleanup: a `validate` method is added to `SnapshotProducer` to run validations, and existing operations are updated to use it.

The validation for `RowDelta` finds data files that have been deleted since a starting "from" snapshot and validates that the set of referenced data files does not intersect the deleted data file set.

Configuration uses two methods:

* `RowDelta.validateFromSnapshot(id)` to set the starting snapshot ID for accumulating deleted files
* `RowDelta.validateDataFilesExist(Iterable)` to add required data files to the validation

This also adds `PositionDeleteWriter.referencedDataFiles` to return data files that should be validated.
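As a rough illustration of the check described above, the following is a minimal, self-contained sketch and not the code in this PR. It does not use any Iceberg classes: `Snap`, `deletedSince`, and the file paths are hypothetical names invented for the example, and only the method name `validateDataFilesExist` is borrowed from the description. It assumes each snapshot records its parent and the data files it removed, accumulates the files deleted after the "from" snapshot, and fails if any referenced data file is in that set.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

public class RowDeltaValidationSketch {

  /** Hypothetical stand-in for a table snapshot; not an Iceberg class. */
  static final class Snap {
    final long id;
    final Long parentId;                // null for the first snapshot
    final Set<String> deletedDataFiles; // data file paths removed by this snapshot

    Snap(long id, Long parentId, Set<String> deletedDataFiles) {
      this.id = id;
      this.parentId = parentId;
      this.deletedDataFiles = deletedDataFiles;
    }
  }

  /** Walks back from currentId and collects files deleted after fromId (exclusive). */
  static Set<String> deletedSince(Map<Long, Snap> history, Long currentId, Long fromId) {
    Set<String> deleted = new LinkedHashSet<>();
    Long id = currentId;
    while (id != null && !id.equals(fromId)) {
      Snap snap = history.get(id);
      if (snap == null) {
        break; // unknown ancestor: stop walking
      }
      deleted.addAll(snap.deletedDataFiles);
      id = snap.parentId;
    }
    return deleted;
  }

  /** Fails if any referenced data file is in the deleted set. */
  static void validateDataFilesExist(Set<String> referenced, Set<String> deleted) {
    Set<String> missing = new LinkedHashSet<>(referenced);
    missing.retainAll(deleted); // intersection of referenced and deleted files
    if (!missing.isEmpty()) {
      throw new IllegalStateException("Cannot commit, missing data files: " + missing);
    }
  }

  public static void main(String[] args) {
    Map<Long, Snap> history = new HashMap<>();
    history.put(1L, new Snap(1L, null, Set.of()));
    history.put(2L, new Snap(2L, 1L, Set.of("data/a.parquet")));
    history.put(3L, new Snap(3L, 2L, Set.of()));

    // Files deleted after snapshot 1, as seen from snapshot 3.
    Set<String> deleted = deletedSince(history, 3L, 1L);

    // Position deletes that reference a file that still exists pass validation.
    validateDataFilesExist(Set.of("data/b.parquet"), deleted);

    // Position deletes that reference a concurrently deleted file are rejected.
    try {
      validateDataFilesExist(Set.of("data/a.parquet"), deleted);
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

In terms of the methods named in this PR, the "from" ID would presumably be supplied through `RowDelta.validateFromSnapshot(id)` and the referenced set through `RowDelta.validateDataFilesExist(Iterable)`, for example using the result of `PositionDeleteWriter.referencedDataFiles`.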
