jackye1995 commented on a change in pull request #3207:
URL: https://github.com/apache/iceberg/pull/3207#discussion_r718962376
##########
File path: api/src/main/java/org/apache/iceberg/actions/RewriteDataFiles.java
##########
@@ -76,6 +76,22 @@
*/
String TARGET_FILE_SIZE_BYTES = "target-file-size-bytes";
+ /**
+ * Determines if the data rewrite action should also remove non-global deletes associated with the data files.
+ * By enabling this option, any data filter specified through {@link #filter(Expression)} will be converted to
+ * an inclusive partition filter based on all the historical partition specs of the table.
+ */
+ String REMOVE_PARTITION_DELETES = "remove-partition-deletes";
+ boolean REMOVE_PARTITION_DELETES_DEFAULT = false;
+
+ /**
+ * Determines if the data rewrite action should also remove global deletes.
+ * When enabling this option, specify a data filter would result in {@link IllegalArgumentException}
+ * because a full table scan planning must be performed to safely remove global deletes.
Review comment:
Similarly, we block removal of global delete files unless there is no
data filter. Only when we know the scan tasks contain all data files can we
safely say the global deletes have been merged. This is not the most efficient
approach, because statistics might show that a given data filter does not
affect our ability to remove a global delete, but I think this is the best
we can do while leveraging the same interface and avoiding the need for
customized planning.
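
To make the rule concrete, here is a minimal, self-contained sketch of the check described above. It is not the code from this PR: the option key `remove-global-deletes`, its default, and the `GlobalDeleteGuard` class are assumptions for illustration (the quoted diff is cut off before the actual constant), and the data filter is reduced to a boolean flag instead of an Iceberg `Expression`.

```java
import java.util.Map;

public class GlobalDeleteGuard {
  // Hypothetical option key and default; the real names are not visible in the quoted diff.
  public static final String REMOVE_GLOBAL_DELETES = "remove-global-deletes";
  public static final boolean REMOVE_GLOBAL_DELETES_DEFAULT = false;

  /**
   * Rejects the combination of global-delete removal and a data filter.
   *
   * @param options action options as string key/value pairs
   * @param hasDataFilter true if the caller set a filter that could exclude data files
   * @throws IllegalArgumentException if global-delete removal is enabled together with a filter
   */
  public static void validate(Map<String, String> options, boolean hasDataFilter) {
    boolean removeGlobalDeletes = Boolean.parseBoolean(
        options.getOrDefault(
            REMOVE_GLOBAL_DELETES, String.valueOf(REMOVE_GLOBAL_DELETES_DEFAULT)));

    // A global delete may apply to any data file, so it can only be dropped
    // once every data file has been rewritten, i.e. the scan was unfiltered.
    if (removeGlobalDeletes && hasDataFilter) {
      throw new IllegalArgumentException(
          "Cannot remove global deletes when a data filter is set: "
              + "planning must cover all data files");
    }
  }

  public static void main(String[] args) {
    // Option left at its default: a data filter is allowed.
    validate(Map.of(), true);

    // Option enabled without a filter: allowed, the scan covers the whole table.
    validate(Map.of(REMOVE_GLOBAL_DELETES, "true"), false);

    // Option enabled with a filter: rejected.
    try {
      validate(Map.of(REMOVE_GLOBAL_DELETES, "true"), true);
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

The check is deliberately conservative: it rejects every filtered rewrite rather than trying to prove from statistics that a particular filter still leaves the global delete safe to remove.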
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]