aokolnychyi commented on code in PR #4503:
URL: https://github.com/apache/iceberg/pull/4503#discussion_r849867676


##########
api/src/main/java/org/apache/iceberg/actions/DeleteOrphanFiles.java:
##########
@@ -80,6 +80,19 @@
    */
   DeleteOrphanFiles executeDeleteWith(ExecutorService executorService);
 
+  /**
+   * Passes a table which contains the list of actual files in the table. This skips the directory listing - any
+   * files in the actualFilesTable provided which are not found in table metadata will be deleted. Not compatible
+   * with `location` or `older_than` arguments - this assumes that the provided table of actual files has been
+   * filtered down to the table’s location and only includes files older than a reasonable retention interval.
+   *
+   * @param tableName the table containing the actual files dataset.  Should have a single `file_path` string column
+   * @return this for method chaining
+   */
+  default DeleteOrphanFiles actualFilesTable(String tableName) {

Review Comment:
   @rdblue, do you mean exposing this method in `BaseDeleteOrphanFilesSparkAction`, similarly to `expire` in `BaseExpireSnapshotsSparkAction`, or introducing a Spark-specific interface?
   
   It is a bummer as this functionality can be useful for all engines.
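
   For reference, here is a rough sketch of the engine-neutral option, where the method lives on the shared interface as a `default` and only engines that support it override it. All type names below (`OrphanFilesAction`, `SparkLikeAction`, `OtherEngineAction`) are hypothetical stand-ins, not the real Iceberg classes, and throwing `UnsupportedOperationException` from the default is an assumption, since the hunk above does not show the method body.

```java
import java.util.List;

// Hypothetical, simplified stand-in for the shared action interface.
interface OrphanFilesAction {
  // Assumed default: engines that do not override this reject the call.
  default OrphanFilesAction actualFilesTable(String tableName) {
    throw new UnsupportedOperationException(
        getClass().getSimpleName() + " does not support actualFilesTable");
  }

  // Returns a description of where the candidate files came from.
  List<String> execute();
}

// Hypothetical engine that opts in by overriding the default.
class SparkLikeAction implements OrphanFilesAction {
  private String actualFilesTable;

  @Override
  public SparkLikeAction actualFilesTable(String tableName) {
    this.actualFilesTable = tableName;
    return this;
  }

  @Override
  public List<String> execute() {
    // Skip the directory listing when a table of actual files was provided.
    return actualFilesTable == null
        ? List.of("listing")
        : List.of("table:" + actualFilesTable);
  }
}

// Hypothetical engine that keeps the default and always lists directories.
class OtherEngineAction implements OrphanFilesAction {
  @Override
  public List<String> execute() {
    return List.of("listing");
  }
}
```

   With this shape, other engines can adopt the option later without an API change. The trade-off is that callers only find out at runtime whether a given engine supports it.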



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

