Tommo56700 opened a new issue, #3008: URL: https://github.com/apache/iceberg-python/issues/3008
### Feature Request / Improvement

Hi team,

I’ve recently migrated to AWS S3 Tables and switched from using the GlueCatalog to the REST catalog. After updating the catalog configuration, everything works correctly in local, single‑process scenarios. However, I’m encountering intermittent failures when scaling out to multiple Dask workers making parallel requests.

Specifically, I’m seeing occasional `ThrottlingException` errors coming from AWS SigV4‑signed requests. Once throttling occurs, subsequent requests sometimes fail with:

`requests.exceptions.HTTPError: 403 Client Error`

My understanding is that throttled SigV4 signing attempts can lead to follow‑on request failures, resulting in unauthorized S3 operations.

According to AWS’s recommendation for handling throttling on signed requests, retry configuration should be applied via botocore: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/retries.html

While reviewing the PyIceberg implementation, I noticed:

- The GlueCatalog sets reasonable default retry settings on the underlying boto session: https://github.com/apache/iceberg-python/blob/main/pyiceberg/catalog/glue.py#L331-L348
- The REST catalog, and specifically the SigV4Adapter, does not appear to configure any retry behavior by default: https://github.com/apache/iceberg-python/blob/main/pyiceberg/catalog/rest/__init__.py#L684-L694

This creates an inconsistency where switching from Glue to REST results in weaker retry behavior, which becomes visible under parallel load.

### Question / Proposal

Should the REST catalog align its default retry behavior with what GlueCatalog already applies?

At present, users can manually configure retry settings by supplying a custom botocore session via catalog properties, but it seems reasonable and more consistent for the REST catalog to provide safe defaults, especially since SigV4Adapter is now a common path for AWS S3 Tables.
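
As a rough illustration of the retry behavior being asked for, the sketch below retries a throttled call with capped exponential backoff and full jitter. It uses only the standard library and is not botocore's or PyIceberg's actual implementation: `ThrottlingError`, `call_with_backoff`, and all parameter names and default values are hypothetical stand-ins chosen for this example.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class ThrottlingError(Exception):
    """Hypothetical stand-in for a throttling response such as ThrottlingException."""


def call_with_backoff(
    request: Callable[[], T],
    max_attempts: int = 5,
    base_delay: float = 0.1,
    max_delay: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call request(), retrying throttled attempts with capped exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return request()
        except ThrottlingError:
            if attempt == max_attempts:
                raise  # retries exhausted: surface the throttling error
            # The delay ceiling doubles on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Full jitter: pick a random delay in [0, ceiling] so that
            # parallel workers do not all retry at the same instant.
            sleep(random.uniform(0, ceiling))
    raise AssertionError("unreachable")
```

The jitter matters most in the scenario described above: many Dask workers that are throttled at the same moment would otherwise retry in lockstep and be throttled again.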

Matching (or at least approaching) the GlueCatalog’s retry policy would provide the following benefits:

- Avoid intermittent throttling‑triggered failures in distributed workloads
- Improve parity between Glue and REST behavior
- Reduce the configuration burden on users switching to REST for AWS‑backed tables

Happy to discuss or test any proposed changes. Thanks for your work on the project!

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
