leaves12138 opened a new pull request, #8233:
URL: https://github.com/apache/paimon/pull/8233

   ### Purpose
   
   Snapshot expiration currently reads manifest data serially in several 
cleanup paths. This can make expiration slow even when spare IO and CPU 
capacity is available, especially for tables with many snapshots, manifest 
lists, and index manifests.
   
   ### Changes
   
   - Parallelize reading data manifest entries during data file cleanup while 
preserving the original manifest order for merge/delete decisions.
   - Parallelize reading manifest lists and index manifests across snapshots 
during metadata cleanup, then apply deletion decisions sequentially against the 
shared skipping set.
   - Parallelize building the retained manifest/index skipping set across 
retained/tagged snapshots.
   - Keep existing best-effort behavior for unavailable manifest lists and 
missing index manifests.
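
   The pattern behind these changes can be sketched as follows. This is a 
minimal illustration, not the code in this PR: the class 
`ParallelExpireSketch`, its methods, and the file names are hypothetical, and 
manifests are modelled as plain strings. It assumes reads are independent and 
can run on a thread pool, that results must come back in input order, and that 
a file is deleted only if it is absent from the shared skipping set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch; none of these names come from the Paimon code base.
final class ParallelExpireSketch {

    /** Runs all reads on a pool, then collects results in the original input order. */
    static <T, R> List<R> readInOrder(List<T> inputs, Function<T, R> reader, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<R>> futures = new ArrayList<>();
            for (T input : inputs) {
                Callable<R> task = () -> reader.apply(input);
                futures.add(pool.submit(task));
            }
            // Futures are consumed in submission order, so output order matches input order.
            List<R> results = new ArrayList<>();
            for (Future<R> future : futures) {
                results.add(future.get());
            }
            return results;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } catch (ExecutionException e) {
            throw new IllegalStateException(e.getCause());
        } finally {
            pool.shutdownNow();
        }
    }

    /** Best-effort wrapper: a failed read yields an empty list instead of aborting the run. */
    static <T, R> Function<T, List<R>> bestEffort(Function<T, List<R>> reader) {
        return input -> {
            try {
                return reader.apply(input);
            } catch (RuntimeException e) {
                return Collections.emptyList();
            }
        };
    }

    /** Sequential phase: only this single thread reads and mutates the skipping set. */
    static List<String> decideDeletions(List<List<String>> filesPerSnapshot, Set<String> skippingSet) {
        List<String> toDelete = new ArrayList<>();
        for (List<String> files : filesPerSnapshot) {
            for (String file : files) {
                // add() returns false for retained or already-handled files.
                if (skippingSet.add(file)) {
                    toDelete.add(file);
                }
            }
        }
        return toDelete;
    }

    public static void main(String[] args) {
        List<String> manifestLists = Arrays.asList("list-1", "list-2", "list-3");
        // Simulated reader: "list-2" is unavailable, every other list shares one manifest.
        Function<String, List<String>> reader = name -> {
            if (name.equals("list-2")) {
                throw new IllegalStateException("missing " + name);
            }
            return Arrays.asList(name + "/manifest", "shared-manifest");
        };
        Set<String> skippingSet = new HashSet<>(Collections.singleton("list-3/manifest"));
        List<List<String>> read = readInOrder(manifestLists, bestEffort(reader), 4);
        System.out.println(decideDeletions(read, skippingSet));
    }
}
```

   Keeping the decision phase single-threaded means the skipping set needs no 
synchronization, and each file is scheduled for deletion at most once even 
when several snapshots reference it.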
   
   ### Tests
   
   - `mvn -N -Pfast-build -DskipTests install`
   - `mvn -pl paimon-api,paimon-test-utils,paimon-common,paimon-codegen,paimon-codegen-loader,paimon-arrow,paimon-format -Pfast-build -DskipTests install`
   - `mvn -pl paimon-core -Pfast-build -DskipTests compile`
   - `mvn -pl paimon-core -Pfast-build -Dtest=FileDeletionTest,ExpireSnapshotsTest,IndexFileExpireTableTest test`
   - `mvn -pl paimon-core -DskipTests validate`
   

