rdblue commented on a change in pull request #1469:
URL: https://github.com/apache/iceberg/pull/1469#discussion_r499738707



##########
File path: core/src/main/java/org/apache/iceberg/BaseRowDelta.java
##########
@@ -45,4 +62,80 @@ public RowDelta addDeletes(DeleteFile deletes) {
     add(deletes);
     return this;
   }
+
+  @Override
+  public RowDelta validateFromSnapshot(long snapshotId) {
+    this.startingSnapshotId = snapshotId;
+    return this;
+  }
+
+  @Override
+  public RowDelta validateDeletedFiles() {
+    return validateDeletedFiles(true);
+  }
+
+  public RowDelta validateDeletedFiles(boolean shouldValidate) {
+    this.validateDeletes = shouldValidate;
+    return this;
+  }
+
+  @Override
+  public RowDelta validateDataFilesExist(Iterable<? extends CharSequence> referencedFiles) {

Review comment:
   We can support both later if we need to.

   My expectation is that in many cases we won't have the `DataFile` information
to pass back. A simple example is `DELETE FROM`: it would be implemented with a
scan that also projects `_file` and `_pos`, then writes the results into delete
files in parallel tasks. Passing additional `DataFile` fields to the writer just
so we could return more information in metadata would take some work. It could
be worth it, but I think it is reasonable to start with the simpler option.
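
   To illustrate why file paths are the cheaper thing to pass back, here is a rough, self-contained sketch of what a delete task could collect. `ReferencedFilesSketch`, `DeletedRow`, and `referencedFiles` are hypothetical names made up for this example, not Iceberg classes; the only assumption is that each scanned row carries the projected `_file` and `_pos` values.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch: none of these types are part of Iceberg.
final class ReferencedFilesSketch {

  // One row produced by a scan that also projects _file and _pos.
  static final class DeletedRow {
    final CharSequence file; // value of the projected _file column
    final long pos;          // value of the projected _pos column

    DeletedRow(CharSequence file, long pos) {
      this.file = file;
      this.pos = pos;
    }
  }

  // Collects the distinct data file paths that a task wrote deletes against.
  // Paths are converted to String because CharSequence does not define equals/hashCode.
  static List<CharSequence> referencedFiles(Iterable<DeletedRow> rows) {
    Set<String> seen = new LinkedHashSet<>();
    for (DeletedRow row : rows) {
      seen.add(row.file.toString());
    }
    return new ArrayList<>(seen);
  }

  public static void main(String[] args) {
    List<DeletedRow> rows = new ArrayList<>();
    rows.add(new DeletedRow("s3://bucket/data/a.parquet", 4L));
    rows.add(new DeletedRow("s3://bucket/data/a.parquet", 9L));
    rows.add(new DeletedRow("s3://bucket/data/b.parquet", 0L));

    // Only the paths are needed here; no other DataFile fields reach the writer.
    System.out.println(referencedFiles(rows));
  }
}
```

   A list like this is the kind of value that could be handed to `validateDataFilesExist(Iterable<? extends CharSequence> referencedFiles)` from the diff above, without the writer needing any other `DataFile` fields.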






