> On 5 Jun 2015, at 08:03, Pierre B <pierre.borckm...@realimpactanalytics.com> wrote:
>
> Hi list!
>
> My problem is quite simple.
> I need to access several S3 buckets, using different credentials:
> ```
> val c1 = sc.textFile("s3n://[ACCESS_KEY_ID1:SECRET_ACCESS_KEY1]@bucket1/file.csv").count
> val c2 = sc.textFile("s3n://[ACCESS_KEY_ID2:SECRET_ACCESS_KEY2]@bucket2/file.csv").count
> val c3 = sc.textFile("s3n://[ACCESS_KEY_ID3:SECRET_ACCESS_KEY3]@bucket3/file.csv").count
> ...
> ```
>
> One or several of those AWS credentials might contain "/" in the secret access key.
> This is a known problem and, from my research, the only ways to deal with these "/" are:
> 1/ use environment variables to set the AWS credentials, then access the S3 buckets without specifying the credentials;
> 2/ set the Hadoop configuration to contain the credentials.
>
> However, neither of these solutions allows me to access different buckets with different credentials.
>
> Can anyone help me with this?
>
> Thanks
>
> Pierre
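A workaround that often comes up for this problem is percent-encoding the "/" before embedding the secret key in the URI. As a minimal sketch (the key and bucket names here are illustrative, not real credentials), the encoding itself is straightforward, but note that HADOOP-3733 is precisely about s3n mishandling secret keys with slashes, so this may still fail on affected Hadoop versions:

```scala
import java.net.URLEncoder

// Illustrative secret key containing a '/' -- the problematic case.
// (Assumption: a made-up value, not a real AWS credential.)
val secret = "abc/defXYZ"

// Percent-encode the key so the '/' does not break URI parsing.
// '/' becomes "%2F".
val encoded = URLEncoder.encode(secret, "UTF-8")

// The URI as it would be passed to sc.textFile (access key and
// bucket are placeholders). Per HADOOP-3733, s3n on affected
// versions does not handle the escaped key correctly, which is
// why regenerating the credentials is the safer fix.
val uri = s"s3n://ACCESS_KEY_ID1:${encoded}@bucket1/file.csv"
```

For the per-bucket requirement specifically, another option (still sequential, not concurrent) is to reset `fs.s3n.awsAccessKeyId` and `fs.s3n.awsSecretAccessKey` on `sc.hadoopConfiguration` between reads, forcing each count to materialize before switching keys.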
This is a long-known outstanding bug in Hadoop s3n that nobody has ever sat down to fix. One subtlety is that it's really hard to test, as you need credentials with a "/" in them. The general best practice is to recreate your credentials.

Now, if you can get a patch to work against Hadoop trunk, I promise I will commit it:
https://issues.apache.org/jira/browse/HADOOP-3733

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org