thvasilo opened a new issue, #37260: URL: https://github.com/apache/arrow/issues/37260
### Describe the bug, including details regarding any error messages, version, and platform.

We have a process that reads the metadata for multiple files using `joblib`'s `threading` backend. Occasionally we observe the following error:

```
Traceback (most recent call last):
  [rest of call stack] in get_rows_for_parquet_file
    nrows = pq.read_metadata(
  File "/usr/local/lib/python3.9/site-packages/pyarrow/parquet/core.py", line 3481, in read_metadata
    file_ctx = where = filesystem.open_input_file(where)
  File "pyarrow/_fs.pyx", line 770, in pyarrow._fs.FileSystem.open_input_file
  File "pyarrow/error.pxi", line 144, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 115, in pyarrow.lib.check_status
OSError: When reading information for key 'path/to/my/file' in bucket 'my-bucket': AWS Error UNKNOWN (HTTP status 503) during HeadObject operation: No response body.
```

I've tried setting the number of retries to 10 when creating the `S3FileSystem`, but we still occasionally get this error.

I would expect such an error if the file did not exist, but that's not the case here, so I'm wondering whether a different error (like `503 S3 Slow Down`) is being hidden by the system.

Any ideas what the root cause of such an error could be, and whether it's possible to surface it?

### Component(s)

Python
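For context, here is a minimal sketch of an application-level workaround: wrapping the metadata read in a retry loop with exponential backoff. It is an illustration only and not part of the original report. The helper name `call_with_retries`, its parameters, and the check for `"HTTP status 503"` in the exception message are all assumptions, the last one based on the wording of the traceback above. `get_rows_for_parquet_file` is a hypothetical reconstruction of the function named in the traceback.

```python
import random
import time


def call_with_retries(fn, *args, max_attempts=5, base_delay=0.5,
                      is_transient=None, sleep=time.sleep, **kwargs):
    """Call fn(*args, **kwargs), retrying transient OSErrors with backoff.

    Hypothetical helper: the name, parameters, and defaults are illustrative.
    """
    if is_transient is None:
        # Assumption: the 503 is only visible in the message text,
        # as in the traceback from the report.
        def is_transient(exc):
            return "HTTP status 503" in str(exc)

    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args, **kwargs)
        except OSError as exc:
            # Give up on the last attempt or on non-transient errors
            # (for example, a file that really does not exist).
            if attempt == max_attempts or not is_transient(exc):
                raise
            # Exponential backoff plus jitter, so that concurrent threads
            # do not all retry at the same instant.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))


def get_rows_for_parquet_file(path, filesystem):
    """Hypothetical reconstruction of the caller seen in the traceback."""
    # Imported lazily so the retry helper itself has no pyarrow dependency.
    import pyarrow.parquet as pq

    metadata = call_with_retries(pq.read_metadata, path, filesystem=filesystem)
    return metadata.num_rows
```

This only works around the symptom. It does not explain why the retries configured on the filesystem do not absorb the 503, nor does it surface the underlying S3 error code.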
