ForeverAngry opened a new pull request, #2205: URL: https://github.com/apache/iceberg-python/pull/2205
## ✨ Improvements & Features

This PR enhances the existing `add_files` function, defined here: https://github.com/apache/iceberg-python/blob/dc439402e33d642c9a0c3261f10c078017d6566e/pyiceberg/table/__init__.py#L860

- 🚦 **Retry Logic for Commits**: Added retry handling to `add_files` using the `tenacity` library, improving reliability for concurrent writes and large file operations.

---

## 📝 Rationale for this change

Concurrent writes using `add_files` often result in commit failures, and each failure requires expensive recomputation. Adding retry logic directly to the commit step avoids rebuilding the `DataFile` objects on every attempt, which improves performance and scalability for large or multi-process operations.

---

## ✅ Are these changes tested?

- **Yes.**
- Added unit tests for the retry logic to ensure commit exceptions are handled correctly.

---

## 🚦 Are there any user-facing changes?

- **Yes.**
- Users can optionally pass `tenacity` retry kwargs as arguments to `add_files`.

---

**Closes:** #2203
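To illustrate the rationale, here is a minimal sketch of the pattern: the expensive preparation runs once, and only the commit is retried. It is not the code in this PR. It uses a plain standard-library loop in place of `tenacity`, and `CommitFailedError`, `build_data_files`, `add_files_with_retry`, and `make_flaky_commit` are hypothetical names invented for this example.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class CommitFailedError(Exception):
    """Hypothetical stand-in for the error raised when a concurrent commit wins."""


def build_data_files(paths: List[str], counter: Dict[str, int]) -> List[str]:
    # Stand-in for the expensive step (building DataFile objects from file metadata).
    counter["builds"] += 1
    return [f"datafile:{p}" for p in paths]


def add_files_with_retry(
    paths: List[str],
    commit: Callable[[List[str]], int],
    max_attempts: int = 3,
    wait_seconds: float = 0.0,
    counter: Optional[Dict[str, int]] = None,
) -> int:
    counter = counter if counter is not None else {"builds": 0}
    # Built once, outside the retry loop, so a failed commit does not repeat this work.
    data_files = build_data_files(paths, counter)
    for attempt in range(1, max_attempts + 1):
        try:
            return commit(data_files)
        except CommitFailedError:
            if attempt == max_attempts:
                raise  # attempts exhausted: surface the last failure to the caller
            time.sleep(wait_seconds)
    raise ValueError("max_attempts must be at least 1")


def make_flaky_commit(failures: int) -> Tuple[Callable[[List[str]], int], Dict[str, int]]:
    # Simulates a commit that conflicts `failures` times before succeeding.
    state = {"calls": 0}

    def commit(data_files: List[str]) -> int:
        state["calls"] += 1
        if state["calls"] <= failures:
            raise CommitFailedError("simulated concurrent commit conflict")
        return len(data_files)

    return commit, state
```

With `make_flaky_commit(2)` and the default `max_attempts=3`, the commit is attempted three times while `build_data_files` runs only once. In the PR itself, the loop's role is presumably filled by `tenacity`, configured through the retry kwargs the user passes in.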
