aokolnychyi commented on a change in pull request #2865:
URL: https://github.com/apache/iceberg/pull/2865#discussion_r677032700



##########
File path: core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
##########
@@ -62,6 +63,9 @@
       ImmutableSet.of(DataOperations.OVERWRITE, DataOperations.REPLACE, DataOperations.DELETE);
   private static final Set<String> VALIDATE_DATA_FILES_EXIST_SKIP_DELETE_OPERATIONS =
       ImmutableSet.of(DataOperations.OVERWRITE, DataOperations.REPLACE);
+  // delete files are only added in "overwrite" operations
+  private static final Set<String> VALIDATE_REPLACED_DATA_FILES_OPERATIONS =
+      ImmutableSet.of(DataOperations.OVERWRITE);

Review comment:
       I agree with your interpretation of `DataOperations`. If it describes
what happens to the table from a logical perspective, then whether we use
delete files or not is indeed an implementation detail. Still, I think this
detail may become important as we move forward, since there will be a bigger
focus on performance. I'd opt for having that flag so that we can distinguish
such cases. If we had it, we would be able to skip more snapshots during the
validation added in this PR.
   
   But I am sometimes too paranoid about extra work during commits, so it may
be just me.
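
To make the trade-off concrete, here is a minimal sketch of the idea. It is not code from this PR. `SnapshotInfo` and its `addedDeleteFiles` field are hypothetical stand-ins for a snapshot and for the flag discussed above; only the operation names and the contents of `VALIDATE_REPLACED_DATA_FILES_OPERATIONS` come from the diff and `DataOperations`.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Set;

class SnapshotSkipSketch {
  // String values of the corresponding DataOperations constants.
  static final String APPEND = "append";
  static final String OVERWRITE = "overwrite";
  static final String DELETE = "delete";

  // Same contents as the set added in the diff above.
  static final Set<String> VALIDATE_REPLACED_DATA_FILES_OPERATIONS =
      Collections.singleton(OVERWRITE);

  // Hypothetical snapshot stand-in; addedDeleteFiles is the assumed flag.
  static final class SnapshotInfo {
    final long id;
    final String operation;
    final boolean addedDeleteFiles;

    SnapshotInfo(long id, String operation, boolean addedDeleteFiles) {
      this.id = id;
      this.operation = operation;
      this.addedDeleteFiles = addedDeleteFiles;
    }
  }

  // Without the flag: every snapshot with a matching operation is validated.
  static List<Long> toValidateByOperation(List<SnapshotInfo> history) {
    List<Long> ids = new ArrayList<>();
    for (SnapshotInfo snapshot : history) {
      if (VALIDATE_REPLACED_DATA_FILES_OPERATIONS.contains(snapshot.operation)) {
        ids.add(snapshot.id);
      }
    }
    return ids;
  }

  // With the flag: matching snapshots that added no delete files are skipped too.
  static List<Long> toValidateWithFlag(List<SnapshotInfo> history) {
    List<Long> ids = new ArrayList<>();
    for (SnapshotInfo snapshot : history) {
      if (VALIDATE_REPLACED_DATA_FILES_OPERATIONS.contains(snapshot.operation)
          && snapshot.addedDeleteFiles) {
        ids.add(snapshot.id);
      }
    }
    return ids;
  }

  public static void main(String[] args) {
    List<SnapshotInfo> history = Arrays.asList(
        new SnapshotInfo(1L, APPEND, false),
        new SnapshotInfo(2L, OVERWRITE, false), // overwrite without delete files
        new SnapshotInfo(3L, OVERWRITE, true),  // overwrite that added delete files
        new SnapshotInfo(4L, DELETE, false));

    System.out.println(toValidateByOperation(history)); // prints [2, 3]
    System.out.println(toValidateWithFlag(history));    // prints [3]
  }
}
```

In this toy history the operation-only filter still checks snapshot 2 even though it added no delete files, while the flag-based filter skips it. That skipped snapshot is the extra commit-time work the flag would save.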






