Li0k opened a new pull request, #1853: URL: https://github.com/apache/iceberg-rust/pull/1853
## Which issue does this PR close?

- Closes #.

## What changes are included in this PR?

## Summary

Refactor the `SnapshotProducer` validation methods to use internal state instead of requiring redundant parameters.

## Problem

`SnapshotProducer` already holds all the state its validation methods need, yet the current call sites still pass that data in as parameters. I believe this could lead to some issues:

1. **Data mismatch risk**: Callers could pass different data than what is stored in `SnapshotProducer`, so one set of files is validated while another is committed.
2. **API complexity**: As more validations are added (e.g., delete files, file existence checks), each method would require additional parameters, making the API harder to use.
3. **Redundant passing**: The same data that was already provided during construction has to be passed again.

## Changes

- Modified `validate_added_data_files()` and `validate_duplicate_files()` to operate on `self.added_data_files` directly.
- Updated `FastAppendAction::commit()` to call the validation methods without passing the `added_data_files` parameter.

## Motivation

Previously, `added_data_files` was passed as a parameter to the validation methods even though it was already stored in `SnapshotProducer`:

```rust
// Before
snapshot_producer.validate_added_data_files(&self.added_data_files)?;

// After
snapshot_producer.validate_added_data_files()?;
```

## Benefits

1. Better encapsulation - validation operates on the object's own state.
2. Safer API - eliminates the possibility of a data mismatch.
3. Simpler interface - no redundant parameters needed.

## Discussion

Since `SnapshotProducer` already holds all the necessary state, could we go further and run validation inside the `new` function, to improve data consistency and encapsulation?

## Are these changes tested?
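
To show the change described under Changes in isolation, here is a minimal, self-contained sketch. It does not use the crate's real types. `DataFile`, the `String` error type, and the bodies of both checks (empty paths, paths repeated within the batch) are simplified stand-ins assumed for illustration, and `try_new` is a hypothetical constructor that only illustrates the question raised under Discussion.

```rust
use std::collections::HashSet;

/// Simplified stand-in for a data file (assumption: only the path matters here).
#[derive(Debug, Clone)]
pub struct DataFile {
    pub file_path: String,
}

impl DataFile {
    pub fn new(file_path: &str) -> Self {
        Self { file_path: file_path.to_string() }
    }
}

/// Simplified stand-in for the producer: it owns the files it will commit.
#[derive(Debug)]
pub struct SnapshotProducer {
    added_data_files: Vec<DataFile>,
}

impl SnapshotProducer {
    /// The files are handed over exactly once, at construction.
    pub fn new(added_data_files: Vec<DataFile>) -> Self {
        Self { added_data_files }
    }

    /// Hypothetical constructor for the Discussion idea:
    /// run every validation once, before the producer can be used.
    pub fn try_new(added_data_files: Vec<DataFile>) -> Result<Self, String> {
        let producer = Self::new(added_data_files);
        producer.validate_added_data_files()?;
        producer.validate_duplicate_files()?;
        Ok(producer)
    }

    /// Takes no parameter, so it can only check the files that will be committed.
    /// Placeholder check: reject empty paths.
    pub fn validate_added_data_files(&self) -> Result<(), String> {
        for file in &self.added_data_files {
            if file.file_path.is_empty() {
                return Err("data file has an empty path".to_string());
            }
        }
        Ok(())
    }

    /// Placeholder check: reject a path that appears twice in the same batch.
    pub fn validate_duplicate_files(&self) -> Result<(), String> {
        let mut seen = HashSet::new();
        for file in &self.added_data_files {
            // `insert` returns false when the path was already present.
            if !seen.insert(file.file_path.as_str()) {
                return Err(format!("duplicate data file: {}", file.file_path));
            }
        }
        Ok(())
    }
}
```

Because neither method accepts a file list, a caller has no way to validate one set of files and commit another. With something like `try_new`, an invalid producer could not be constructed at all, at the cost of making construction fallible.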
