[ 
https://issues.apache.org/jira/browse/ARROW-15419?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17482674#comment-17482674
 ] 

Weston Pace commented on ARROW-15419:
-------------------------------------

A few things to check:

 * Arrow's S3 implementation checks the usual places (e.g. 
~/.aws/credentials on Linux) to automatically detect credentials to use.
 * S3's API will attempt to authenticate with those credentials even when it 
is accessing a public resource.

So you might try removing (or temporarily moving) any credentials files first 
and then try accessing it again.
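
To see what might be getting picked up before moving anything, a small 
standard-library sketch like the one below can help. It is only a rough 
diagnostic: the environment variable names and the default file location are 
assumptions based on common AWS SDK conventions, not something taken from 
Arrow's source, and find_credential_sources is a hypothetical helper name.

```python
import os
from pathlib import Path

# Assumed (not exhaustive) environment variables that commonly influence
# AWS credential lookup.
CREDENTIAL_ENV_VARS = (
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_SESSION_TOKEN",
    "AWS_PROFILE",
)


def find_credential_sources(env=None, home=None):
    """Return a list of places where AWS credentials appear to be configured."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    found = [f"env:{name}" for name in CREDENTIAL_ENV_VARS if env.get(name)]
    # AWS_SHARED_CREDENTIALS_FILE, when set, overrides the default location.
    cred_file = Path(
        env.get("AWS_SHARED_CREDENTIALS_FILE") or home / ".aws" / "credentials"
    )
    if cred_file.is_file():
        found.append(f"file:{cred_file}")
    return found


if __name__ == "__main__":
    for source in find_credential_sources():
        print(source)
```

An empty result suggests nothing obvious is configured. Anything listed is a 
candidate to unset or move aside temporarily before retrying.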

I've even run into issues where accessing public data only works if you 
access it anonymously (though I didn't think that was the case for the Ursa 
Labs taxi data).
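
Since the issue also mentions pyarrow, one way to try anonymous access from 
Python is sketched below. It assumes a pyarrow build with S3 support; 
anonymous_taxi_filesystem is a hypothetical helper name, and the default 
region is the one this comment gives for the bucket.

```python
def anonymous_taxi_filesystem(region="us-east-2"):
    """Build an S3 filesystem that skips credential lookup entirely."""
    # Imported lazily so this module still loads where pyarrow lacks S3 support.
    from pyarrow import fs

    # anonymous=True asks Arrow not to look for or send any credentials.
    return fs.S3FileSystem(anonymous=True, region=region)
```

With the filesystem in hand, listing the bucket would look something like 
anonymous_taxi_filesystem().get_file_info(pyarrow.fs.FileSelector("ursa-labs-taxi-data")), 
which needs network access. If that works while the default call fails, 
locally detected credentials are the likely culprit.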

You might also check any config files to see if there is a default region set. 
The Ursa Labs taxi data is in us-east-2.
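
A quick way to look for a conflicting default region is sketched below, using 
only the standard library. The environment variable names and the 
~/.aws/config location are assumptions based on common AWS conventions, and 
both function names are hypothetical.

```python
import configparser
import os
from pathlib import Path

EXPECTED_REGION = "us-east-2"  # region of the taxi data, per this comment


def configured_regions(env=None, config_path=None):
    """Map each place a default region is set to the value found there."""
    env = os.environ if env is None else env
    regions = {
        name: env[name]
        for name in ("AWS_REGION", "AWS_DEFAULT_REGION")
        if env.get(name)
    }
    path = Path(
        config_path or env.get("AWS_CONFIG_FILE") or Path.home() / ".aws" / "config"
    )
    # Interpolation is disabled so values containing '%' are read verbatim.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is silently skipped
    for section in parser.sections():
        if parser.has_option(section, "region"):
            regions[f"{path}[{section}]"] = parser.get(section, "region")
    return regions


def mismatched_regions(regions, expected=EXPECTED_REGION):
    """Keep only the entries whose region differs from the expected one."""
    return {src: region for src, region in regions.items() if region != expected}
```

Calling mismatched_regions(configured_regions()) lists any setting that points 
somewhere other than us-east-2. A mismatch is not necessarily the cause of the 
error, only something worth ruling out.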

> [R] Access denied error upon trying to download NYC Taxi data used in the 
> vignette / examples
> ---------------------------------------------------------------------------------------------
>
>                 Key: ARROW-15419
>                 URL: https://issues.apache.org/jira/browse/ARROW-15419
>             Project: Apache Arrow
>          Issue Type: Bug
>          Components: Python, R
>    Affects Versions: 6.0.1
>         Environment: > sessionInfo()
> R version 4.1.2 (2021-11-01)
> Platform: x86_64-apple-darwin17.0 (64-bit)
> Running under: macOS Catalina 10.15.7
> Matrix products: default
> BLAS:   
> /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
> LAPACK: 
> /Library/Frameworks/R.framework/Versions/4.1/Resources/lib/libRlapack.dylib
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods  
> [7] base     
> other attached packages:
> [1] arrow_6.0.1
> loaded via a namespace (and not attached):
>  [1] assertthat_0.2.1  R6_2.5.1          sys_3.4          
>  [4] jsonlite_1.7.2    magrittr_2.0.1    credentials_1.3.2
>  [7] cli_3.1.0         rlang_0.4.12      curl_4.3.2       
> [10] vctrs_0.3.8       tools_4.1.2       bit64_4.0.5      
> [13] glue_1.6.0        purrr_0.3.4       bit_4.0.4        
> [16] compiler_4.1.2    askpass_1.1       sessioninfo_1.2.2
> [19] openssl_1.4.6     tidyselect_1.1.1 
>            Reporter: Thomas Sandmann
>            Priority: Minor
>
> I am trying to run the example code from the arrow R package vignette 
> [Working with Arrow Datasets and dplyr|https://arrow.apache.org/docs/r/articles/dataset.html]. 
> But data retrieval from S3 fails both using the arrow R package and the 
> pyarrow module (code not shown).
> The first step in the R vignette instructs users to download the NYC dataset 
> from S3 with {{copy_files}}, but the function returns an *Access Denied* 
> error:
> {{> library(arrow)}}
> {{> arrow::arrow_with_s3()}}
> {{[1] TRUE}}
> {{> arrow::copy_files("s3://ursa-labs-taxi-data", "nyc-taxi")}}
> {{Error: IOError: When listing objects under key '' in bucket 
> 'ursa-labs-taxi-data': AWS Error [code 15]: Access Denied with address : 
> 52.219.100.184 with address : 52.219.100.184}}
>  
> Perhaps the data has moved or the bucket permissions have changed?



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
