[
https://issues.apache.org/jira/browse/ARROW-13048?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Li updated ARROW-13048:
-----------------------------
Summary: [C++] S3FileSystem fails moving filepaths containing = or + (was:
[Python] S3FileSystem fails moving filepaths containing = or +)
> [C++] S3FileSystem fails moving filepaths containing = or +
> -----------------------------------------------------------
>
> Key: ARROW-13048
> URL: https://issues.apache.org/jira/browse/ARROW-13048
> Project: Apache Arrow
> Issue Type: Bug
> Components: C++, Python
> Affects Versions: 4.0.1
> Reporter: Joerg Schneider
> Assignee: David Li
> Priority: Major
> Labels: filesystem, pull-request-available
> Fix For: 5.0.0
>
> Time Spent: 40m
> Remaining Estimate: 0h
>
> Hi Arrow team,
> we have the very common use case of partitioned Parquet tables on S3,
> written by Spark. Their folder names contain an equals sign (=) to denote
> the partition value.
>
> When using PyArrow's S3FileSystem `move` function, it is not possible to
> move objects in the bucket whose path contains `=` anywhere:
> {code:java}
> OSError: When copying key
> 'table/date=202007/part-00000-e39069c2-0ea6-4a62-85ea-8011047cd4f4.c000.snappy.parquet'
> in bucket 'bucket' to key
> 'table2/date=202007/part-00000-e39069c2-0ea6-4a62-85ea-8011047cd4f4.c000.snappy.parquet'
> in bucket 'bucket': AWS Error [code 133]: The specified key does not
> exist.{code}
> Moving also fails when the paths are URL-quoted beforehand, like these:
>
> {code:java}
> OSError: When copying key
> 'table/date%3D202007/part-00000-e39069c2-0ea6-4a62-85ea-8011047cd4f4.c000.snappy.parquet'
> in bucket 'bucket' to key
> 'table2/date%3D202007/part-00000-e39069c2-0ea6-4a62-85ea-8011047cd4f4.c000.snappy.parquet'
> in bucket 'bucket': AWS Error [code 133]: The specified key does not
> exist.{code}
>
> The source object definitely exists: it was in fact returned by a
> FileSelector from PyArrow itself and passed directly to move.
> Is there any configuration option to be set, or special quoting to be used?
> Thanks in advance.
> Joerg
>
>
>
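For readers following along, here is a minimal sketch of what the "preemptively URL-quoted" attempt in the report amounts to. It uses only the Python standard library. The helper name `quote_key` and the shortened file name are hypothetical and are not part of PyArrow or of the original report. The sketch only shows that quoting on the caller side produces a different key string (`date%3D202007` rather than `date=202007`), which is consistent with the second "key does not exist" error above. It does not claim to reproduce or explain the underlying bug.

```python
from urllib.parse import quote, unquote


def quote_key(key: str) -> str:
    """Percent-encode an object key, leaving '/' separators intact.

    Hypothetical helper for illustration only; not a PyArrow API.
    """
    return quote(key, safe="/")


# Shortened, made-up file name; the partition folder matches the report.
raw = "table/date=202007/part-00000.parquet"
quoted = quote_key(raw)

print(quoted)                  # table/date%3D202007/part-00000.parquet
print(quoted == raw)           # False: the quoted string names a different key
print(unquote(quoted) == raw)  # True: quoting is reversible
```

Because object keys are compared as literal strings, passing the quoted form as the source path asks for a key that was never written.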