leaves12138 opened a new pull request, #7858:
URL: https://github.com/apache/paimon/pull/7858

   ### Purpose
   
   Overwrite commits rebuild the target partition file set before every atomic commit retry. For large partitions, this repeatedly performs a full manifest scan after commit conflicts, which can make retries very expensive.
   
   ### Changes
   
   - Add a stateful overwrite `CommitChangesProvider` in `CommitScanner`.
   - Keep the first overwrite attempt behavior as a full target scan.
   - On later retries, update the cached target file set from snapshot `DELTA` manifests between the previously checked snapshot and the latest snapshot.
   - Use the provider from the common overwrite path, so both normal insert overwrite and drop/truncate partition paths are covered.
   - Keep index overwrite changes based on the latest index manifest on each retry.
   - Add a retry test for partial-partition insert overwrite with a concurrent commit between attempts.
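
The caching idea in the list above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `OverwriteRetrySketch`, `SnapshotReader`, `DeltaEntry`, and every method name are hypothetical stand-ins, and file identity is reduced to a plain file name. It assumes each snapshot's delta lists the files added to and deleted from the target partitions.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical sketch; none of these names are Paimon's real API. */
class OverwriteRetrySketch {

    /** One delta manifest entry: a file was added or deleted. */
    static final class DeltaEntry {
        final String fileName;
        final boolean isAdd;

        DeltaEntry(String fileName, boolean isAdd) {
            this.fileName = fileName;
            this.isAdd = isAdd;
        }
    }

    /** Hypothetical table access: an expensive full scan and a cheap per-snapshot delta read. */
    interface SnapshotReader {
        Set<String> fullScan(long snapshotId);

        List<DeltaEntry> delta(long snapshotId);
    }

    private final SnapshotReader reader;
    private Set<String> cachedFiles;      // null until the first attempt
    private long checkedSnapshot = -1;    // last snapshot folded into the cache
    private int fullScans = 0;

    OverwriteRetrySketch(SnapshotReader reader) {
        this.reader = reader;
    }

    /** Returns the files currently in the target partitions as of latestSnapshot. */
    Set<String> currentFiles(long latestSnapshot) {
        if (cachedFiles == null) {
            // First attempt: same behavior as before, a full target scan.
            cachedFiles = new HashSet<>(reader.fullScan(latestSnapshot));
            fullScans++;
        } else {
            // Retry: replay only the deltas committed since the last checked snapshot.
            for (long id = checkedSnapshot + 1; id <= latestSnapshot; id++) {
                for (DeltaEntry entry : reader.delta(id)) {
                    if (entry.isAdd) {
                        cachedFiles.add(entry.fileName);
                    } else {
                        cachedFiles.remove(entry.fileName);
                    }
                }
            }
        }
        checkedSnapshot = latestSnapshot;
        return new HashSet<>(cachedFiles); // defensive copy for the caller
    }

    int fullScanCount() {
        return fullScans;
    }
}
```

In this sketch, a retry after a conflict calls `currentFiles` again with the newer snapshot id, so the cost of the retry scales with the number of intervening commits instead of the partition size.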
   
   ### Tests
   
   - `mvn -pl paimon-api,paimon-test-utils,paimon-common,paimon-codegen,paimon-codegen-loader,paimon-arrow,paimon-format -DskipTests install`
   - `mvn -pl paimon-core -DskipITs -Dtest=FileStoreCommitTest#testOverwriteRetryUpdatesCurrentFilesWithDelta test`
   - `mvn -pl paimon-core -DskipITs -Dtest=FileStoreCommitTest#testOverwritePartialCommit+testDropPartitions+testIndexFiles+testDropStatsForOverwrite+testOverwriteRetryUpdatesCurrentFilesWithDelta test`
   - `mvn -pl paimon-core -DskipITs -Dtest=FileStoreCommitTest test`
   

