rdblue opened a new pull request #1469:
URL: https://github.com/apache/iceberg/pull/1469


   This adds a new validation to the `RowDelta` operation that checks that the 
data files referenced by position deletes still exist.
   
   This includes some small cleanup by adding a `validate` method to 
`SnapshotProducer` to run validations and updating existing operations to use 
it.
   
   The validation for `RowDelta` finds data files that have been deleted since 
a starting "from" snapshot and checks that the set of referenced data files 
does not intersect that set of deleted data files. It is configured with two 
methods:
   
   * `RowDelta.validateFromSnapshot(id)` to set the starting snapshot ID for 
accumulating deleted files
   * `RowDelta.validateDataFilesExist(Iterable)` to add required data files to 
the validation
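
   The check described above can be sketched as follows. This is a minimal, 
self-contained illustration and not the code in this PR: the class 
`RowDeltaValidationSketch`, the `SnapshotInfo` type, and the use of plain 
string paths are assumptions made for the example, and the snapshot history is 
assumed to be ordered oldest to newest.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical stand-in types for illustration; these are not Iceberg classes.
final class RowDeltaValidationSketch {

  // A snapshot reduced to its ID and the data file paths it removed.
  static final class SnapshotInfo {
    final long id;
    final Set<String> deletedDataFiles;

    SnapshotInfo(long id, Set<String> deletedDataFiles) {
      this.id = id;
      this.deletedDataFiles = deletedDataFiles;
    }
  }

  // Accumulates data files deleted by snapshots committed after fromSnapshotId.
  // Assumes `history` is ordered oldest to newest. If the starting snapshot
  // is not in the history, nothing is accumulated.
  static Set<String> deletedSince(List<SnapshotInfo> history, long fromSnapshotId) {
    Set<String> deleted = new HashSet<>();
    boolean pastStart = false;
    for (SnapshotInfo snapshot : history) {
      if (pastStart) {
        deleted.addAll(snapshot.deletedDataFiles);
      } else if (snapshot.id == fromSnapshotId) {
        pastStart = true;
      }
    }
    return deleted;
  }

  // Throws if any required data file intersects the deleted file set.
  static void validateDataFilesExist(
      List<SnapshotInfo> history, long fromSnapshotId, Iterable<String> requiredFiles) {
    Set<String> deleted = deletedSince(history, fromSnapshotId);
    List<String> missing = new ArrayList<>();
    for (String file : requiredFiles) {
      if (deleted.contains(file)) {
        missing.add(file);
      }
    }
    if (!missing.isEmpty()) {
      throw new IllegalStateException("Cannot commit, missing data files: " + missing);
    }
  }
}
```

   In this sketch a commit would fail only when a file that the position 
deletes refer to was removed after the starting snapshot; deletions made at or 
before that snapshot are ignored.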
   
   This also adds `PositionDeleteWriter.referencedDataFiles` to return data 
files that should be validated.
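
   As a rough illustration of how a writer could track the files it 
references, the sketch below records each data file path as deletes are 
written. The class `PositionDeleteWriterSketch`, its `delete` method, and the 
string-based row format are hypothetical and do not reflect the actual 
`PositionDeleteWriter` implementation.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical writer for illustration; not the Iceberg PositionDeleteWriter.
final class PositionDeleteWriterSketch {
  // Stand-in for rows written to a delete file.
  private final List<String> rows = new ArrayList<>();
  // Distinct data file paths referenced by the written deletes.
  private final Set<String> referencedDataFiles = new LinkedHashSet<>();

  // Records a delete of the row at `position` in the data file at `dataFilePath`.
  void delete(String dataFilePath, long position) {
    rows.add(dataFilePath + ":" + position);
    referencedDataFiles.add(dataFilePath);
  }

  // Returns a read-only view of the data files that should be validated.
  Set<String> referencedDataFiles() {
    return Collections.unmodifiableSet(referencedDataFiles);
  }

  // Number of delete rows written so far.
  int rowCount() {
    return rows.size();
  }
}
```

   The returned set is the kind of input that could then be passed to 
`RowDelta.validateDataFilesExist(Iterable)`.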




