danielhumanmod opened a new pull request, #1508:
URL: https://github.com/apache/polaris/pull/1508

   ### Motivation
   
   This is a follow-up to #312.
   
   Previously, when DROP TABLE PURGE was issued, Polaris cleaned up data files, 
manifest files, and metadata files, but did not clean up partition-level 
statistics files.
   
   ### Current Behavior
   Partition statistics files (partition_stats) remain in storage after the 
table is dropped. These files are listed in the TableMetadata but are not 
included in the batch deletion task, which leaves them orphaned.
   
   ### Changes Introduced
   - Added support for including `partitionStatisticsFiles` from `TableMetadata` in 
the batch cleanup task (`BatchFileCleanupTaskHandler`).
   - Updated `getMetadataFileBatches()` to collect and batch partition 
statistics files for deletion.
   - Added test coverage in `TableCleanupTaskHandlerTest` and 
`BatchFileCleanupTaskHandlerTest` to verify:
     - partition statistics files are scheduled for deletion
     - they are correctly deleted by the task handler
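
The batching described in the changes above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the class `MetadataFileBatcher`, the method `batchMetadataFiles`, its parameters, and the use of plain path strings are all hypothetical, and it assumes the other metadata-level files (previous metadata files and table statistics files) are already gathered as lists of paths.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical helper, for illustration only; not the Polaris implementation.
public class MetadataFileBatcher {

  // Merges all metadata-level file paths, now including partition statistics
  // files, and splits them into batches of at most batchSize paths each.
  static List<List<String>> batchMetadataFiles(
      List<String> previousMetadataFiles,
      List<String> statisticsFiles,
      List<String> partitionStatisticsFiles,
      int batchSize) {
    if (batchSize <= 0) {
      throw new IllegalArgumentException("batchSize must be positive");
    }

    // Partition statistics paths join the same stream as the other files,
    // so they end up in the same batch deletion tasks.
    List<String> allPaths =
        Stream.of(previousMetadataFiles, statisticsFiles, partitionStatisticsFiles)
            .flatMap(List::stream)
            .collect(Collectors.toList());

    List<List<String>> batches = new ArrayList<>();
    for (int start = 0; start < allPaths.size(); start += batchSize) {
      int end = Math.min(start + batchSize, allPaths.size());
      // Copy the sublist so each batch is independent of allPaths.
      batches.add(new ArrayList<>(allPaths.subList(start, end)));
    }
    return batches;
  }
}
```

Under these assumptions, each returned batch would correspond to one batch cleanup task, so a table with no partition statistics files simply contributes nothing extra to the batches.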
   
   ### Desired Outcome
   After a DROP TABLE PURGE, all Iceberg table metadata files, including 
partition-level statistics files, are cleaned up as expected.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
