Vadym Dytyniak created ARROW-18228:
--------------------------------------
Summary: AWS Error SLOW_DOWN during PutObject operation
Key: ARROW-18228
URL: https://issues.apache.org/jira/browse/ARROW-18228
Project: Apache Arrow
Issue Type: Bug
Affects Versions: 10.0.0
Reporter: Vadym Dytyniak
We use Dask to parallelise read/write operations and pyarrow to write datasets
from the worker nodes.
After pyarrow released version 10.0.0, our data flows automatically switched to
the latest version and some of them started to fail with the following error:
{code:java}
  File "/usr/local/lib/python3.10/dist-packages/org/store/storage.py", line 768, in _write_partition
    ds.write_dataset(
  File "/usr/local/lib/python3.10/dist-packages/pyarrow/dataset.py", line 988, in write_dataset
    _filesystemdataset_write(
  File "pyarrow/_dataset.pyx", line 2859, in pyarrow._dataset._filesystemdataset_write
    check_status(CFileSystemDataset.Write(c_options, c_scanner))
  File "pyarrow/error.pxi", line 115, in pyarrow.lib.check_status
    raise IOError(message)
OSError: When creating key 'equities.us.level2.by_security/' in bucket 'org-prod': AWS Error SLOW_DOWN during PutObject operation: Please reduce your request rate.
{code}
Do you know what changed in the dataset write path between 9.0.0 and 10.0.0?
Knowing that would help us fix the issue on our side.
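
A possible stopgap while the cause is unknown is to retry the failing call with backoff. The traceback shows the failure reaching Python as an OSError whose message contains SLOW_DOWN. The following is a minimal sketch under that assumption only; the helper names (is_throttled, retry_on_slow_down) and their defaults are hypothetical, are not part of pyarrow, and are not the code in storage.py.

```python
import random
import time


def is_throttled(exc: OSError) -> bool:
    """Return True if the error text looks like an S3 throttling response."""
    # Assumption: throttling is only recognisable by the SLOW_DOWN marker
    # in the message, as in the traceback above.
    return "SLOW_DOWN" in str(exc)


def retry_on_slow_down(write, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call write() and retry with exponential backoff on SLOW_DOWN errors.

    Hypothetical helper: names and defaults are illustrative only.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return write()
        except OSError as exc:
            # Re-raise anything that is not throttling, and the final failure.
            if not is_throttled(exc) or attempt == max_attempts:
                raise
            # Exponential backoff with jitter so that parallel workers
            # do not all retry at the same moment.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay))
```

The write would be passed in as a zero-argument callable, for example a lambda wrapping the existing ds.write_dataset(...) call. Whether re-running a partially completed write is safe depends on the write options in use, so this only reduces the failure rate and does not explain the change in behaviour.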