[ 
https://issues.apache.org/jira/browse/ARROW-13237?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17458330#comment-17458330
 ] 

Antoine Pitrou commented on ARROW-13237:
----------------------------------------

Ok, it seems you can use (on recent AWS SDK versions perhaps?) a pseudo-region 
named "aws-global" that will automatically handle redirects:
{code}
>>> from pyarrow import fs
>>> f = fs.S3FileSystem(region='aws-global')
>>> f.get_file_info(fs.FileSelector('ursa-labs-taxi-data'))
[<FileInfo for 'ursa-labs-taxi-data/2009': type=FileType.Directory>,
 <FileInfo for 'ursa-labs-taxi-data/2010': type=FileType.Directory>,
 <FileInfo for 'ursa-labs-taxi-data/2011': type=FileType.Directory>,
 <FileInfo for 'ursa-labs-taxi-data/2012': type=FileType.Directory>,
 <FileInfo for 'ursa-labs-taxi-data/2013': type=FileType.Directory>,
 <FileInfo for 'ursa-labs-taxi-data/2014': type=FileType.Directory>,
 <FileInfo for 'ursa-labs-taxi-data/2015': type=FileType.Directory>,
 <FileInfo for 'ursa-labs-taxi-data/2016': type=FileType.Directory>,
 <FileInfo for 'ursa-labs-taxi-data/2017': type=FileType.Directory>,
 <FileInfo for 'ursa-labs-taxi-data/2018': type=FileType.Directory>,
 <FileInfo for 'ursa-labs-taxi-data/2019': type=FileType.Directory>]
{code}
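
As a possible stopgap for callers who do not know the bucket's region up front, the workaround above could be wrapped in a retry. This is only a sketch under stated assumptions: the helper name `list_with_region_fallback` and its `make_fs` / `list_fn` parameters are hypothetical and not part of pyarrow, and matching on the "PermanentRedirect" text in the OSError message is a guess based on the error quoted in this issue.

```python
def list_with_region_fallback(make_fs, list_fn, fallback_region="aws-global"):
    """Run list_fn on a default-region filesystem; retry once on a redirect.

    make_fs(region) builds a filesystem (region=None means "use the default").
    list_fn(filesystem) performs the listing and returns its result.
    Both callables are hypothetical hooks introduced for this sketch.
    """
    try:
        return list_fn(make_fs(None))
    except OSError as exc:
        # Assumption: the redirect surfaces as an OSError whose message
        # contains "PermanentRedirect", as in the report quoted below.
        if "PermanentRedirect" not in str(exc):
            raise
        return list_fn(make_fs(fallback_region))


# Possible usage with pyarrow (needs network access, so left commented out):
# from pyarrow import fs
# infos = list_with_region_fallback(
#     lambda region: fs.S3FileSystem(region=region),
#     lambda s3: s3.get_file_info(fs.FileSelector("ursa-labs-taxi-data")),
# )
```

Errors that do not mention the redirect are re-raised unchanged, so unrelated failures (credentials, missing bucket) are not hidden by the retry.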


> [C++] S3 FileSystem doesn't seem to handle redirects
> ----------------------------------------------------
>
>                 Key: ARROW-13237
>                 URL: https://issues.apache.org/jira/browse/ARROW-13237
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: C++, Python
>    Affects Versions: 4.0.1
>            Reporter: Alessandro Molina
>            Priority: Major
>             Fix For: 7.0.0
>
>
> In some conditions AWS S3 seems to respond with a redirect, but Arrow seems 
> to consider it an error instead of following the redirect.
> For example see
> {code}
>     from pyarrow import fs
>     s3, bucket = fs.FileSystem.from_uri("s3://ursa-labs-taxi-data/?region=us-east-1")
>     print(s3.get_file_info(fs.FileSelector(bucket+"/2011", recursive=True)))
> {code}
> The error that you get is
> {code}
>  OSError: When listing objects under key '2011' in bucket 
> 'ursa-labs-taxi-data': AWS Error [code 100]: Unable to parse ExceptionName: 
> PermanentRedirect Message: The bucket you are attempting to access must be 
> addressed using the specified endpoint. Please send all future requests to 
> this endpoint.
> {code}
> It should probably follow the `PermanentRedirect` instead of choking on it.
> It is also possible to reproduce it using
> {code}
>     from pyarrow import fs
>     s3 = fs.SubTreeFileSystem("ursa-labs-taxi-data", fs.S3FileSystem())
>     print(s3.get_file_info(fs.FileSelector("2011", recursive=True)))
> {code}



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
