leaves12138 opened a new pull request, #7858: URL: https://github.com/apache/paimon/pull/7858
### Purpose

Overwrite commits rebuild the target partition file set before every atomic commit retry. For large partitions this repeatedly performs a full manifest scan after commit conflicts, which can make retries very expensive.

### Changes

- Add a stateful overwrite `CommitChangesProvider` in `CommitScanner`.
- Keep the first overwrite attempt behavior as a full target scan.
- On later retries, update the cached target file set from snapshot `DELTA` manifests between the previous checked snapshot and the latest snapshot.
- Use the provider from the common overwrite path, so both normal insert overwrite and drop/truncate partition paths are covered.
- Keep index overwrite changes based on the latest index manifest on each retry.
- Add a retry test for partial-partition insert overwrite with a concurrent commit between attempts.

### Tests

- `mvn -pl paimon-api,paimon-test-utils,paimon-common,paimon-codegen,paimon-codegen-loader,paimon-arrow,paimon-format -DskipTests install`
- `mvn -pl paimon-core -DskipITs -Dtest=FileStoreCommitTest#testOverwriteRetryUpdatesCurrentFilesWithDelta test`
- `mvn -pl paimon-core -DskipITs -Dtest=FileStoreCommitTest#testOverwritePartialCommit+testDropPartitions+testIndexFiles+testDropStatsForOverwrite+testOverwriteRetryUpdatesCurrentFilesWithDelta test`
- `mvn -pl paimon-core -DskipITs -Dtest=FileStoreCommitTest test`
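
### Illustrative sketch

The retry behavior described under Changes can be pictured with a small toy model. This is not the code in this PR and does not use Paimon's real classes: `OverwriteRetrySketch`, `SnapshotLog`, `Change`, and `CachedTargetFiles` are hypothetical names, and the in-memory per-snapshot change lists only stand in for `DELTA` manifests. It assumes snapshot ids start at 1 and that the first attempt's full scan can be modeled as replaying every delta.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical, simplified model; none of these types are Paimon classes.
public class OverwriteRetrySketch {

    /** One delta entry: a data file added to or deleted from the target partition. */
    public static final class Change {
        public final String file;
        public final boolean added;

        public Change(String file, boolean added) {
            this.file = file;
            this.added = added;
        }
    }

    /** Toy snapshot log: snapshot id -> changes (stands in for DELTA manifests). */
    public static final class SnapshotLog {
        public final TreeMap<Long, List<Change>> deltas = new TreeMap<>();

        public void commit(long id, Change... changes) {
            deltas.put(id, Arrays.asList(changes));
        }

        public long latest() {
            return deltas.isEmpty() ? 0L : deltas.lastKey();
        }
    }

    /** Stateful provider: full scan on the first attempt, delta replay on retries. */
    public static final class CachedTargetFiles {
        private final SnapshotLog log;
        private final Set<String> files = new HashSet<>();
        private long checkedSnapshot = -1L; // -1 means no attempt has run yet
        public int fullScans = 0;
        public int deltaReads = 0;

        public CachedTargetFiles(SnapshotLog log) {
            this.log = log;
        }

        /** Returns the files the overwrite must delete as of the latest snapshot. */
        public Set<String> filesToOverwrite() {
            long latest = log.latest();
            boolean firstAttempt = checkedSnapshot < 0;
            // First attempt: replay everything (models the full target scan).
            // Retry: replay only snapshots in (checkedSnapshot, latest].
            long from = firstAttempt ? 0L : checkedSnapshot;
            if (firstAttempt) {
                fullScans++;
            }
            for (List<Change> delta : log.deltas.subMap(from, false, latest, true).values()) {
                if (!firstAttempt) {
                    deltaReads++;
                }
                for (Change c : delta) {
                    if (c.added) {
                        files.add(c.file);
                    } else {
                        files.remove(c.file);
                    }
                }
            }
            checkedSnapshot = latest;
            return new HashSet<>(files); // defensive copy of the cached set
        }
    }
}
```

In this model, a concurrent commit that lands between two attempts costs one delta read on the retry rather than a second full scan, which is the saving the Purpose section is after.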
