rdblue commented on a change in pull request #1469:
URL: https://github.com/apache/iceberg/pull/1469#discussion_r499738707
##########
File path: core/src/main/java/org/apache/iceberg/BaseRowDelta.java
##########
@@ -45,4 +62,80 @@ public RowDelta addDeletes(DeleteFile deletes) {
add(deletes);
return this;
}
+
+ @Override
+ public RowDelta validateFromSnapshot(long snapshotId) {
+ this.startingSnapshotId = snapshotId;
+ return this;
+ }
+
+ @Override
+ public RowDelta validateDeletedFiles() {
+ return validateDeletedFiles(true);
+ }
+
+ public RowDelta validateDeletedFiles(boolean shouldValidate) {
+ this.validateDeletes = shouldValidate;
+ return this;
+ }
+
+ @Override
+ public RowDelta validateDataFilesExist(Iterable<? extends CharSequence> referencedFiles) {
Review comment:
We can support both file locations and `DataFile` instances later if we need to.
My expectation is that we won't have the `DataFile` information to pass back
in many cases. A simple example is `DELETE FROM`. That would be implemented
with a scan that also projects `_file` and `_pos`, then writes the results into
delete files in parallel tasks. It would be some work to pass additional
`DataFile` fields to the writer just so we could pass more information back in
metadata. It could be worth it, but I think it is reasonable to start with the
simpler option.
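
A rough sketch of that flow, to show why file locations are the natural thing for a writer to hand back. `DeletedRow`, `referencedFiles`, and the sample paths are hypothetical stand-ins and not Iceberg classes; only `addDeletes`, `validateFromSnapshot`, and `validateDataFilesExist` come from the diff above, and the commit call in the comment is an assumption about how they would be chained.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class PositionDeleteSketch {
  // Hypothetical stand-in for one row of a scan that projects _file and _pos.
  static final class DeletedRow {
    final CharSequence file; // value of the _file metadata column
    final long pos;          // value of the _pos metadata column

    DeletedRow(CharSequence file, long pos) {
      this.file = file;
      this.pos = pos;
    }
  }

  // Collects the distinct data file locations that a task's deletes refer to.
  // The task only ever sees locations, never full DataFile metadata.
  static Set<String> referencedFiles(Iterable<DeletedRow> rows) {
    Set<String> files = new LinkedHashSet<>();
    for (DeletedRow row : rows) {
      files.add(row.file.toString());
    }
    return files;
  }

  public static void main(String[] args) {
    // Made-up sample rows, as a DELETE FROM scan might produce them.
    List<DeletedRow> rows = Arrays.asList(
        new DeletedRow("data/a.parquet", 3L),
        new DeletedRow("data/a.parquet", 7L),
        new DeletedRow("data/b.parquet", 0L));

    Set<String> files = referencedFiles(rows);
    System.out.println(files);

    // Assumed usage at commit time (Set<String> fits Iterable<? extends CharSequence>):
    //   rowDelta.addDeletes(deleteFile)
    //       .validateFromSnapshot(startingSnapshotId)
    //       .validateDataFilesExist(files)
    //       .commit();
  }
}
```

Each parallel task would return a set like this along with the delete files it wrote, and the driver would union the sets before committing.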