mitchellciupak opened a new pull request, #2025: URL: https://github.com/apache/iceberg-rust/pull/2025
## Which issue does this PR close?

### Purpose

This PR does not close any existing issues. It addresses an optimization opportunity in the fast append workflow.

### Use Case

I need to validate data files before committing them to the table. Currently, `validate_added_data_files()` is called internally during `commit()`, which means validation runs on every commit attempt, including retries.

### Enhancement

By exposing `validate_added_data_files()` as a public method, I can perform validation once before attempting a commit. Commit retries can then proceed without re-running validation, which reduces overhead in retry scenarios. This is a performance optimization that provides more control over the validation/commit lifecycle.

## What changes are included in this PR?

This commit adds an option to `FastAppendAction` to disable the validation step `snapshot_producer.validate_added_data_files()` during commits. This is similar to the existing option to disable `snapshot_producer.validate_duplicate_files()`.

- Adds an option/flag to `FastAppendAction` to perform or disable validation of added data files when appending.
- Wires the option through the relevant code paths in `append.rs`.

The change is implemented in `crates/iceberg/src/transaction/append.rs`.

## Are these changes tested?

These changes have been manually tested outside the test framework. I noticed that the existing `with_check_duplicate()` method also lacks test coverage. If helpful, I can add tests for both `with_check_duplicate()` and the new `validate_added_data_files()` method in this PR. I'm not sure whether either change is small enough to be considered out of scope for the project's test strategy.
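To illustrate the retry behaviour described above, here is a minimal, self-contained sketch. It does not use the iceberg-rust API: `AppendSketch`, `with_validate_on_commit`, `validate_files`, and `commit_with_retries` are hypothetical names invented for this example, and the "validation" is a placeholder check. It only shows how a flag can move validation out of the retry loop so that it runs once rather than once per attempt.

```rust
/// Hypothetical stand-in for an append action. This is NOT the iceberg-rust
/// `FastAppendAction`; every name here is an assumption made for illustration.
#[derive(Debug, Clone)]
pub struct AppendSketch {
    files: Vec<String>,
    /// When true, each commit attempt re-runs validation (the default).
    validate_on_commit: bool,
}

impl AppendSketch {
    pub fn new(files: Vec<String>) -> Self {
        Self { files, validate_on_commit: true }
    }

    /// Builder-style flag, in the spirit of `with_check_duplicate()`.
    pub fn with_validate_on_commit(mut self, enabled: bool) -> Self {
        self.validate_on_commit = enabled;
        self
    }

    /// Placeholder validation: rejects empty file paths and counts how often
    /// it runs, so the overhead of repeated validation is observable.
    pub fn validate_files(&self, runs: &mut u32) -> Result<(), String> {
        *runs += 1;
        if self.files.iter().any(|f| f.is_empty()) {
            return Err("empty data file path".to_string());
        }
        Ok(())
    }

    /// Simulates a commit that fails `failures_before_success` times before
    /// succeeding. Returns the number of attempts made.
    pub fn commit_with_retries(
        &self,
        failures_before_success: u32,
        runs: &mut u32,
    ) -> Result<u32, String> {
        let mut attempts = 0;
        loop {
            attempts += 1;
            if self.validate_on_commit {
                // Validation is repeated on every attempt, including retries.
                self.validate_files(runs)?;
            }
            if attempts > failures_before_success {
                return Ok(attempts);
            }
        }
    }
}
```

With the flag left at its default, a commit that succeeds on the third attempt validates three times. Calling `validate_files` once up front and then committing with `with_validate_on_commit(false)` validates once for the same three attempts.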
