danielhumanmod commented on code in PR #312:
URL: https://github.com/apache/polaris/pull/312#discussion_r1841633196
##########
polaris-service/src/main/java/org/apache/polaris/service/task/ManifestFileCleanupTaskHandler.java:
##########
@@ -42,9 +42,11 @@
import org.slf4j.LoggerFactory;
/**
- * {@link TaskHandler} responsible for deleting all of the files in a manifest
and the manifest
- * itself. Since data files may be present in multiple manifests across
different snapshots, we
- * assume a data file that doesn't exist is missing because it was already
deleted by another task.
+ * {@link TaskHandler} responsible for deleting table files: 1. Manifest
files: It contains all the
+ * files in a manifest and the manifest itself. Since data files may be
present in multiple
+ * manifests across different snapshots, we assume a data file that doesn't
exist is missing because
+ * it was already deleted by another task. 2. Table content files: It contains
previous metadata and
+ * statistics files, which are grouped and deleted in batch
Review Comment:
> Need to handle the same for partition statistics files also (in this PR or
as a follow up) more details:
[apache/iceberg#9409](https://github.com/apache/iceberg/pull/9409)
Thanks for your suggestion! Would it be okay to address this in a separate
PR avoid making this one too large? (Left a comment on the code, I can create
an issue to track it if necessary)
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]