rdblue commented on a change in pull request #2865:
URL: https://github.com/apache/iceberg/pull/2865#discussion_r676775636
##########
File path: core/src/main/java/org/apache/iceberg/MergingSnapshotProducer.java
##########
@@ -62,6 +63,9 @@
ImmutableSet.of(DataOperations.OVERWRITE, DataOperations.REPLACE,
DataOperations.DELETE);
private static final Set<String>
VALIDATE_DATA_FILES_EXIST_SKIP_DELETE_OPERATIONS =
ImmutableSet.of(DataOperations.OVERWRITE, DataOperations.REPLACE);
+ // delete files are only added in "overwrite" operations
Review comment:
No. A rewrite must not change the data in the table. A delete rewrite may
convert an equality delete file to position deletes, or may rewrite position
deletes into a more efficient file layout, but it cannot change which rows are
deleted. A compaction operation must apply or carry forward any deletes on the
data files it rewrites, so it must be aware of all deletes that exist at the
start of the operation. That means a compaction already knows about every
delete rewritten by a concurrent delete rewrite, and so it is safe for the two
to happen at the same time.
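
To make the argument concrete, here is a toy sketch, not Iceberg code. It
assumes a data file is just a list of row ids, an equality delete is a set of
ids, and a position delete is a set of row positions. The class and method
names (DeleteRewriteSketch, applyEqualityDelete, toPositionDelete,
applyPositionDelete) are hypothetical and exist only for illustration. It
shows that a compaction applying the deletes it saw at its start keeps the
same rows as a reader applying the rewritten deletes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model only: none of these names are Iceberg APIs.
public class DeleteRewriteSketch {

  // Rows of the data file that survive an equality delete on the given ids.
  public static List<Integer> applyEqualityDelete(List<Integer> dataFile, Set<Integer> deletedIds) {
    List<Integer> live = new ArrayList<>();
    for (Integer id : dataFile) {
      if (!deletedIds.contains(id)) {
        live.add(id);
      }
    }
    return live;
  }

  // A "delete rewrite": express the same equality delete as row positions in this data file.
  public static Set<Integer> toPositionDelete(List<Integer> dataFile, Set<Integer> deletedIds) {
    Set<Integer> positions = new HashSet<>();
    for (int pos = 0; pos < dataFile.size(); pos++) {
      if (deletedIds.contains(dataFile.get(pos))) {
        positions.add(pos);
      }
    }
    return positions;
  }

  // Rows of the data file that survive a position delete on the given positions.
  public static List<Integer> applyPositionDelete(List<Integer> dataFile, Set<Integer> deletedPositions) {
    List<Integer> live = new ArrayList<>();
    for (int pos = 0; pos < dataFile.size(); pos++) {
      if (!deletedPositions.contains(pos)) {
        live.add(dataFile.get(pos));
      }
    }
    return live;
  }

  public static void main(String[] args) {
    List<Integer> dataFile = Arrays.asList(10, 20, 30, 20);
    Set<Integer> equalityDelete = new HashSet<>(Arrays.asList(20));

    // Compaction applies the deletes that existed when it started.
    List<Integer> compacted = applyEqualityDelete(dataFile, equalityDelete);

    // A concurrent delete rewrite changes the representation of the delete, not its effect.
    Set<Integer> positionDelete = toPositionDelete(dataFile, equalityDelete);
    List<Integer> viaRewrittenDeletes = applyPositionDelete(dataFile, positionDelete);

    // Both views keep the same rows, so the two operations do not conflict.
    System.out.println(compacted.equals(viaRewrittenDeletes));
  }
}
```

In this model the rewritten delete removes exactly the rows the compaction
already dropped, which is why neither operation needs to validate against the
other.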
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]