[
https://issues.apache.org/jira/browse/ARROW-13685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17411449#comment-17411449
]
Kyle Hamlin commented on ARROW-13685:
-------------------------------------
I'm also running into this issue now, after having the following issue resolved
in 5.0.0: https://issues.apache.org/jira/browse/ARROW-13228
[~westonpace] Curious what the intended resolution is, as it's pretty uncommon
to grant bucket-creation permission to a data, ETL, or ML application. I think it
would be preferable to keep bucket creation separate from writing to a
bucket/prefix, or to have something like `create_bucket=True` as a parameter to
write_dataset.
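
To make the suggestion concrete, here is a rough sketch of the opt-in semantics described above. It uses an in-memory stand-in rather than pyarrow: `FakeObjectStore`, this `create_dir` helper, and the `create_bucket` flag are all hypothetical names for illustration, not existing Arrow APIs, and the real S3FileSystem may need to behave differently.

```python
class FakeObjectStore:
    """In-memory stand-in for an S3-like store (hypothetical, not a pyarrow class)."""

    def __init__(self, buckets, can_create_buckets=False):
        self.buckets = set(buckets)
        self.can_create_buckets = can_create_buckets
        self.prefixes = set()

    def create_bucket(self, bucket):
        # Mimics a credential that lacks bucket-creation permission.
        if not self.can_create_buckets:
            raise PermissionError(f"When creating bucket '{bucket}': Access Denied")
        self.buckets.add(bucket)

    def put_prefix(self, bucket, key):
        if bucket not in self.buckets:
            raise FileNotFoundError(f"Bucket '{bucket}' does not exist")
        self.prefixes.add((bucket, key))


def create_dir(store, path, create_bucket=False):
    """Create every prefix in `path`; touch the bucket only when asked to."""
    bucket, _, key = path.strip("/").partition("/")
    if create_bucket and bucket not in store.buckets:
        store.create_bucket(bucket)
    parts = key.split("/") if key else []
    for i in range(1, len(parts) + 1):
        store.put_prefix(bucket, "/".join(parts[:i]))


# The bucket already exists and the caller cannot create buckets.
store = FakeObjectStore({"my-bucket"}, can_create_buckets=False)
create_dir(store, "my-bucket/test_dir/nested/")  # no bucket creation attempted
```

With this shape, an application that only has write access to a prefix never triggers a bucket-creation call unless it explicitly passes `create_bucket=True`.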
> [Python] Cannot write dataset to S3FileSystem if bucket already exists
> ----------------------------------------------------------------------
>
> Key: ARROW-13685
> URL: https://issues.apache.org/jira/browse/ARROW-13685
> Project: Apache Arrow
> Issue Type: Bug
> Components: Python
> Affects Versions: 5.0.0
> Reporter: Caleb Overman
> Priority: Major
>
> I'm trying to write a Parquet file to an existing S3 bucket using the new
> S3FileSystem interface. However, this fails with an AWS Access Denied
> error (I do have the necessary access). It appears to be trying to recreate the
> bucket, which already exists.
> {code:java}
> import numpy as np
> import pyarrow as pa
> from pyarrow import fs
> import pyarrow.dataset as ds
> s3 = fs.S3FileSystem(region="us-west-2")
> table = pa.table({"a": range(10), "b": np.random.randn(10), "c": [1, 2] * 5})
> ds.write_dataset(
>     table,
>     "my-bucket/test.parquet",
>     format="parquet",
>     filesystem=s3,
> )
> {code}
> {code:java}
> OSError: When creating bucket 'my-bucket': AWS Error [code 15]: Access Denied
> {code}
> I'm seeing the same behavior using {{S3FileSystem.create_dir}} when
> {{recursive=True}}.
> {code:java}
> s3.create_dir("my-bucket/test_dir/", recursive=True) # Fails
> s3.create_dir("my-bucket/test_dir/", recursive=False) # Succeeds
> {code}
>
--
This message was sent by Atlassian Jira
(v8.3.4#803005)