Re: s3 bucket access/read file

2017-05-17 Thread Steve Loughran

On 17 May 2017, at 00:10, jazzed <crackshotm...@gmail.com> wrote:

How did you solve the problem with V4?


which v4 problem? Authentication?

You need to declare the explicit s3a endpoint via fs.s3a.endpoint; otherwise
you get a generic "bad auth" message, which is not a good place to start
debugging from.

full list here: https://hortonworks.github.io/hdp-aws/s3-configure/index.html
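
For example (a minimal sketch; the endpoint shown is Frankfurt's and the bucket
name is a placeholder, so substitute the endpoint that matches your bucket's region):

val conf = sc.hadoopConfiguration
// V4-only regions such as Frankfurt need the region-specific endpoint declared up front
conf.set("fs.s3a.endpoint", "s3.eu-central-1.amazonaws.com")
val rdd = sc.textFile("s3a://mybucket/data/")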










Re: s3 bucket access/read file

2017-05-16 Thread jazzed
How did you solve the problem with V4?








Re: s3 bucket access/read file

2015-07-01 Thread Aaron Davidson
I think Hadoop 2.6 failed to abruptly close streams that weren't fully read, which
we observed as a huge performance hit. We had to backport the 2.7
improvements before being able to use it.


Re: s3 bucket access/read file

2015-07-01 Thread Steve Loughran
s3a uses Amazon's own libraries; it's tested against Frankfurt too.

You have to view s3a support in Hadoop 2.6 as a beta release: it works, with some
issues. Hadoop 2.7.0+ has it all working now, though you are left with the task of
getting hadoop-aws and the Amazon JAR onto your classpath via the --jars
option, as they aren't in the spark-assembly JAR.
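
For example, once hadoop-aws and the AWS SDK JAR are on the classpath via --jars,
something like this should work from spark-shell (a rough sketch; the keys and
bucket are placeholders):

val hc = sc.hadoopConfiguration
hc.set("fs.s3a.access.key", "YOUR_ACCESS_KEY")   // placeholder
hc.set("fs.s3a.secret.key", "YOUR_SECRET_KEY")   // placeholder
val rdd = sc.textFile("s3a://mybucket/temp/")
println(rdd.count())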


On 1 Jul 2015, at 04:46, Aaron Davidson <ilike...@gmail.com> wrote:

Should be able to use s3a (on new Hadoop versions); I believe that will try
v4, or at least has a setting for it.

On Tue, Jun 30, 2015 at 8:31 PM, Exie <tfind...@prodevelop.com.au> wrote:
Not sure if this helps, but the options I set are slightly different:

val hadoopConf=sc.hadoopConfiguration
hadoopConf.set("fs.s3n.awsAccessKeyId","key")
hadoopConf.set("fs.s3n.awsSecretAccessKey","secret")

Try setting them to s3n as opposed to just s3

Good luck!








Re: s3 bucket access/read file

2015-06-30 Thread Aaron Davidson
Should be able to use s3a (on new Hadoop versions); I believe that will try
v4, or at least has a setting for it.

On Tue, Jun 30, 2015 at 8:31 PM, Exie  wrote:

> Not sure if this helps, but the options I set are slightly different:
>
> val hadoopConf=sc.hadoopConfiguration
> hadoopConf.set("fs.s3n.awsAccessKeyId","key")
> hadoopConf.set("fs.s3n.awsSecretAccessKey","secret")
>
> Try setting them to s3n as opposed to just s3
>
> Good luck!


Re: s3 bucket access/read file

2015-06-30 Thread Exie
Not sure if this helps, but the options I set are slightly different:

val hadoopConf=sc.hadoopConfiguration
hadoopConf.set("fs.s3n.awsAccessKeyId","key")
hadoopConf.set("fs.s3n.awsSecretAccessKey","secret")

Try setting them to s3n as opposed to just s3

Good luck!
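
(With those fs.s3n properties set, the read itself would then look roughly like
this; "mybucket" is a placeholder, and the URL scheme has to match the properties,
i.e. s3n://.)

val hfile = sc.textFile("s3n://mybucket/temp/")   // scheme matches the fs.s3n.* settings above
println(hfile.count())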





Re: s3 bucket access/read file

2015-06-30 Thread didi
We finally managed to find the problem: the S3 files were located in
Frankfurt, which only supports the *v4* signature.

*Surprising* is the fact that the Spark core textFile method does
not support that!





Re: s3 bucket access/read file

2015-06-30 Thread Akhil Das
Try this way:

val data = sc.textFile("s3n://ACCESS_KEY:SECRET_KEY@mybucket/temp/")


Thanks
Best Regards

On Mon, Jun 29, 2015 at 11:59 PM, didi  wrote:

> [quoted text from earlier messages snipped]


Re: s3 bucket access/read file

2015-06-29 Thread spark user
Pls check your ACL properties. 


On Monday, June 29, 2015 11:29 AM, didi wrote:

[quoted text from earlier messages snipped]

s3 bucket access/read file

2015-06-29 Thread didi
Hi

*Can't read text file from S3 to create RDD*

after setting the configuration
val hadoopConf=sparkContext.hadoopConfiguration;
hadoopConf.set("fs.s3.impl",
"org.apache.hadoop.fs.s3native.NativeS3FileSystem")
hadoopConf.set("fs.s3.awsAccessKeyId",yourAccessKey)
hadoopConf.set("fs.s3.awsSecretAccessKey",yourSecretKey)

1. running the following
val hfile = sc.textFile("s3n://mybucket/temp/")

I get the error

Exception in thread "main" org.apache.hadoop.fs.s3.S3Exception:
org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/temp' -
ResponseCode=400, ResponseMessage=Bad Request

2. running the following
val hfile = sc.textFile("s3n://mybucket/*.txt")

I get the error

Exception in thread "main" org.apache.hadoop.fs.s3.S3Exception:
org.jets3t.service.S3ServiceException: S3 GET failed for '/'. XML Error
Message: InvalidRequest: *The
authorization mechanism you have provided is not supported. Please use
AWS4-HMAC-SHA256*. C2174C316DEC91CB3oPZfZoPZUbvzXJdVaUGl9N0oI1buMx+A/wJiisx7uZ0bpnTkwsaT6i0fhYhjY97JDWBX1x/2Y8=

I read it has something to do with the v4 signature. Isn't it supported by
the SDK?

3. running the following

val hfile = sc.textFile("s3n://mybucket")

get the error
Exception in thread "main" org.apache.hadoop.fs.s3.S3Exception:
org.jets3t.service.S3ServiceException: S3 HEAD request failed for
'/user%2Fdidi' - ResponseCode=400, ResponseMessage=Bad Request

What does the user have to do here? I am using key & secret!

How can I simply create an RDD from a text file on S3?

Thanks

Didi







Re: S3 Bucket Access

2015-01-20 Thread bbailey
Hi sranga,

Were you ever able to get authentication working with the temporary IAM
credentials (id, secret, & token)? I am in the same situation and it would
be great if we could document a solution so others can benefit from this.

Thanks!
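
For reference, a sketch of how this can be wired up on Hadoop builds whose s3a
connector supports session tokens (this assumes fs.s3a.session.token and the
TemporaryAWSCredentialsProvider are available in your Hadoop version; the three
values are the temporary credentials obtained from the IAM role, and the bucket
is a placeholder):

val hc = sc.hadoopConfiguration
hc.set("fs.s3a.aws.credentials.provider",
  "org.apache.hadoop.fs.s3a.TemporaryAWSCredentialsProvider")
hc.set("fs.s3a.access.key", tempAccessKeyId)      // temporary key id from the role
hc.set("fs.s3a.secret.key", tempSecretAccessKey)  // temporary secret
hc.set("fs.s3a.session.token", tempSessionToken)  // the session token
val rdd = sc.textFile("s3a://mybucket/data/")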


sranga wrote
> Thanks Rishi. That is exactly what I am trying to do now :)
> 
> On Tue, Oct 14, 2014 at 2:41 PM, Rishi Pidva <rpidva@...> wrote:
> 
>>
>> As per EMR documentation:
>> http://docs.amazonaws.cn/en_us/ElasticMapReduce/latest/DeveloperGuide/emr-iam-roles.html
>> Access AWS Resources Using IAM Roles
>>
>> If you've launched your cluster with an IAM role, applications running on
>> the EC2 instances of that cluster can use the IAM role to obtain
>> temporary
>> account credentials to use when calling services in AWS.
>>
>> The version of Hadoop available on AMI 2.3.0 and later has already been
>> updated to make use of IAM roles. If your application runs strictly on
>> top
>> of the Hadoop architecture, and does not directly call any service in
>> AWS,
>> it should work with IAM roles with no modification.
>>
>> If your application calls services in AWS directly, you'll need to update
>> it to take advantage of IAM roles. This means that instead of obtaining
>> account credentials from /home/hadoop/conf/core-site.xml on the EC2
>> instances in the cluster, your application will now either use an SDK to
>> access the resources using IAM roles, or call the EC2 instance metadata
>> to
>> obtain the temporary credentials.
>> --
>>
>> Maybe you can use AWS SDK in your application to provide AWS credentials?
>>
>> https://github.com/seratch/AWScala
>>
>>
>> On Oct 14, 2014, at 11:10 AM, Ranga <sranga@...> wrote:
>>








Re: S3 Bucket Access

2014-10-14 Thread Ranga
[quoted text from earlier messages snipped]

Re: S3 Bucket Access

2014-10-14 Thread Rishi Pidva
[quoted text from earlier messages snipped]


Re: S3 Bucket Access

2014-10-14 Thread Ranga
[quoted text from earlier messages snipped]

Re: S3 Bucket Access

2014-10-14 Thread Ranga
Thanks for the input.
Yes, I did use the "temporary" access credentials provided by the IAM role
(also detailed in the link you provided). The session token needs to be
specified and I was looking for a way to set that in the header (which
doesn't seem possible).
Looks like a static key/secret is the only option.
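
(For anyone hitting this later: the temporary id/secret/token can also be pulled
from the instance metadata with the AWS Java SDK rather than curl; a sketch,
assuming the aws-java-sdk JAR is on the classpath. Whether the s3n connector can
actually use the token is the separate problem discussed above.)

import com.amazonaws.auth.{AWSSessionCredentials, InstanceProfileCredentialsProvider}

// fetch the IAM-role credentials from the EC2 instance metadata service
val creds = new InstanceProfileCredentialsProvider().getCredentials
creds match {
  case s: AWSSessionCredentials =>
    println(s"key id: ${s.getAWSAccessKeyId}")
    println(s"token:  ${s.getSessionToken}")   // expires, typically within an hour or so
  case other =>
    println(s"static credentials: ${other.getAWSAccessKeyId}")
}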

On Tue, Oct 14, 2014 at 10:32 AM, Gen  wrote:

> [quoted text from earlier messages snipped]


Re: S3 Bucket Access

2014-10-14 Thread Gen
Hi,

If I remember well, Spark cannot use the IAM role credentials to access
S3. It uses the id/key in the environment first; if that is null, it uses the
value in the core-site.xml file. So the IAM role is not useful for Spark. The
same problem happens if you want to use the distcp command in Hadoop.


Do you use curl http://169.254.169.254/latest/meta-data/iam/... to get the
"temporary" access? If yes, these credentials cannot be used directly by Spark;
for more information, you can take a look at
http://docs.aws.amazon.com/STS/latest/UsingSTS/using-temp-creds.html



sranga wrote:
> [quoted text from earlier messages snipped]



Re: S3 Bucket Access

2014-10-14 Thread Ranga
Thanks for the pointers.
I verified that the access key-id/secret used are valid. However, the
secret may contain "/" at times. The issues I am facing are as follows:

   - The EC2 instances are set up with an IAMRole and don't have a static
     key-id/secret
   - All of the EC2 instances have access to S3 based on this role (I used
     s3ls and s3cp commands to verify this)
   - I can get a "temporary" access key-id/secret based on the IAMRole, but
     they generally expire in an hour
   - If Spark is not able to use the IAMRole credentials, I may have to
     generate a static key-id/secret. This may or may not be possible in the
     environment I am in (from a policy perspective)



- Ranga

On Tue, Oct 14, 2014 at 4:21 AM, Rafal Kwasny  wrote:

> [quoted text from earlier messages snipped]


Re: S3 Bucket Access

2014-10-14 Thread Rafal Kwasny
Hi,
keep in mind that you're going to have a bad time if your secret key
contains a "/".
This is due to an old and stupid Hadoop bug:
https://issues.apache.org/jira/browse/HADOOP-3733

The best way is to regenerate the key so it does not include a "/".

/Raf

Akhil Das wrote:
> [quoted text from earlier messages snipped]



Re: S3 Bucket Access

2014-10-14 Thread Akhil Das
Try the following:

1. Set the access key and secret key in the sparkContext:

sparkContext.set("AWS_ACCESS_KEY_ID", yourAccessKey)
sparkContext.set("AWS_SECRET_ACCESS_KEY", yourSecretKey)

2. Set the access key and secret key in the environment before starting
your application:

export AWS_ACCESS_KEY_ID=<your access key>
export AWS_SECRET_ACCESS_KEY=<your secret key>

3. Set the access key and secret key inside the hadoop configuration:

val hadoopConf = sparkContext.hadoopConfiguration
hadoopConf.set("fs.s3.impl", "org.apache.hadoop.fs.s3native.NativeS3FileSystem")
hadoopConf.set("fs.s3.awsAccessKeyId", yourAccessKey)
hadoopConf.set("fs.s3.awsSecretAccessKey", yourSecretKey)

4. You can also try:

val lines = sparkContext.textFile("s3n://yourAccessKey:yourSecretKey@<bucket>/path/")


Thanks
Best Regards

On Mon, Oct 13, 2014 at 11:33 PM, Ranga  wrote:

> [quoted text from earlier messages snipped]


Re: S3 Bucket Access

2014-10-14 Thread Gen
Hi,

Are you sure that the id/key you used can access S3? You can try to
use the same id/key through the Python boto package to test it.

I ask because I have almost the same situation as yours, but I can access S3.

Best
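
(The same sanity check can also be done from Scala with the AWS Java SDK,
independent of Spark/Hadoop; a sketch, with the bucket name as a placeholder:)

import scala.collection.JavaConverters._
import com.amazonaws.auth.BasicAWSCredentials
import com.amazonaws.services.s3.AmazonS3Client

// list a few keys to confirm the id/secret can reach the bucket at all
val s3 = new AmazonS3Client(new BasicAWSCredentials(yourAccessKey, yourSecretKey))
s3.listObjects("mybucket").getObjectSummaries.asScala.take(5).foreach(o => println(o.getKey))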






Re: S3 Bucket Access

2014-10-13 Thread Daniil Osipov
There is detailed information available in the official documentation[1].
If you don't have a key pair, you can generate one as described in AWS
documentation [2]. That should be enough to get started.

[1] http://spark.apache.org/docs/latest/ec2-scripts.html
[2] http://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ec2-key-pairs.html

On Mon, Oct 13, 2014 at 4:07 PM, Ranga  wrote:

> Hi Daniil
>
> Could you provide some more details on how the cluster should be
> launched/configured? The EC2 instance that I am dealing with uses the
> concept of IAMRoles. I don't have any "keyfile" to specify to the spark-ec2
> script.
> Thanks for your help.
>
>
> - Ranga
>
> On Mon, Oct 13, 2014 at 3:04 PM, Daniil Osipov 
> wrote:
>
>> (Copying the user list)
>> You should use spark_ec2 script to configure the cluster. If you use
>> trunk version you can use the new --copy-aws-credentials option to
>> configure the S3 parameters automatically, otherwise either include them in
>> your SparkConf variable or add them to
>> /root/spark/ephemeral-hdfs/conf/core-site.xml
>>
>> On Mon, Oct 13, 2014 at 2:56 PM, Ranga  wrote:
>>
>>> The cluster is deployed on EC2 and I am trying to access the S3 files
>>> from within a spark-shell session.
>>>
>>> On Mon, Oct 13, 2014 at 2:51 PM, Daniil Osipov wrote:
>>>
 So is your cluster running on EC2, or locally? If you're running
 locally, you should still be able to access S3 files, you just need to
 locate the core-site.xml and add the parameters as defined in the error.

 On Mon, Oct 13, 2014 at 2:49 PM, Ranga  wrote:

> Hi Daniil
>
> No. I didn't create the spark-cluster using the ec2 scripts. Is that
> something that I need to do? I just downloaded Spark-1.1.0 and Hadoop-2.4.
> However, I am trying to access files on S3 from this cluster.
>
>
> - Ranga
>
> On Mon, Oct 13, 2014 at 2:36 PM, Daniil Osipov <daniil.osi...@shazam.com> wrote:
>
>> Did you add the fs.s3n.aws* configuration parameters in
>> /root/spark/ephemeral-hdfs/conf/core-site.xml?
>>
>> On Mon, Oct 13, 2014 at 11:03 AM, Ranga  wrote:
>>
>>> Hi
>>>
>>> I am trying to access files/buckets in S3 and encountering a
>>> permissions issue. The buckets are configured to authenticate using an
>>> IAMRole provider.
>>> I have set the KeyId and Secret using environment variables (
>>> AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID). However, I am still
>>> unable to access the S3 buckets.
>>>
>>> Before setting the access key and secret the error was: 
>>> "java.lang.IllegalArgumentException:
>>> AWS Access Key ID and Secret Access Key must be specified as the 
>>> username
>>> or password (respectively) of a s3n URL, or by setting the
>>> fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey properties
>>> (respectively)."
>>>
>>> After setting the access key and secret, the error is: "The AWS
>>> Access Key Id you provided does not exist in our records."
>>>
>>> The id/secret being set are the right values. This makes me believe
>>> that something else ("token", etc.) needs to be set as well.
>>> Any help is appreciated.
>>>
>>>
>>> - Ranga
>>>
>>
>>
>

>>>
>>
>


Re: S3 Bucket Access

2014-10-13 Thread Ranga
Hi Daniil

Could you provide some more details on how the cluster should be
launched/configured? The EC2 instance that I am dealing with uses the
concept of IAMRoles. I don't have any "keyfile" to specify to the spark-ec2
script.
Thanks for your help.


- Ranga

On Mon, Oct 13, 2014 at 3:04 PM, Daniil Osipov 
wrote:

> (Copying the user list)
> You should use spark_ec2 script to configure the cluster. If you use trunk
> version you can use the new --copy-aws-credentials option to configure the
> S3 parameters automatically, otherwise either include them in your
> SparkConf variable or add them to
> /root/spark/ephemeral-hdfs/conf/core-site.xml
>
> On Mon, Oct 13, 2014 at 2:56 PM, Ranga  wrote:
>
>> The cluster is deployed on EC2 and I am trying to access the S3 files
>> from within a spark-shell session.
>>
>> On Mon, Oct 13, 2014 at 2:51 PM, Daniil Osipov 
>> wrote:
>>
>>> So is your cluster running on EC2, or locally? If you're running
>>> locally, you should still be able to access S3 files, you just need to
>>> locate the core-site.xml and add the parameters as defined in the error.
>>>
>>> On Mon, Oct 13, 2014 at 2:49 PM, Ranga  wrote:
>>>
 Hi Daniil

 No. I didn't create the spark-cluster using the ec2 scripts. Is that
 something that I need to do? I just downloaded Spark-1.1.0 and Hadoop-2.4.
 However, I am trying to access files on S3 from this cluster.


 - Ranga

 On Mon, Oct 13, 2014 at 2:36 PM, Daniil Osipov <
 daniil.osi...@shazam.com> wrote:

> Did you add the fs.s3n.aws* configuration parameters in
> /root/spark/ephemeral-hdfs/conf/core-ste.xml?
>
> On Mon, Oct 13, 2014 at 11:03 AM, Ranga  wrote:
>
>> Hi
>>
>> I am trying to access files/buckets in S3 and encountering a
>> permissions issue. The buckets are configured to authenticate using an
>> IAMRole provider.
>> I have set the KeyId and Secret using environment variables (
>> AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID). However, I am still
>> unable to access the S3 buckets.
>>
>> Before setting the access key and secret the error was: 
>> "java.lang.IllegalArgumentException:
>> AWS Access Key ID and Secret Access Key must be specified as the username
>> or password (respectively) of a s3n URL, or by setting the
>> fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey properties
>> (respectively)."
>>
>> After setting the access key and secret, the error is: "The AWS
>> Access Key Id you provided does not exist in our records."
>>
>> The id/secret being set are the right values. This makes me believe
>> that something else ("token", etc.) needs to be set as well.
>> Any help is appreciated.
>>
>>
>> - Ranga
>>
>
>

>>>
>>
>


Re: S3 Bucket Access

2014-10-13 Thread Daniil Osipov
(Copying the user list)
You should use spark_ec2 script to configure the cluster. If you use trunk
version you can use the new --copy-aws-credentials option to configure the
S3 parameters automatically, otherwise either include them in your
SparkConf variable or add them to
/root/spark/ephemeral-hdfs/conf/core-site.xml
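
The SparkConf route can look roughly like this (a sketch; it relies on Spark's
spark.hadoop.* pass-through into the Hadoop configuration, and the key values
and bucket are placeholders):

import org.apache.spark.{SparkConf, SparkContext}

val sparkConf = new SparkConf()
  .setAppName("s3-read")
  // spark.hadoop.* entries are copied into the Hadoop Configuration Spark uses
  .set("spark.hadoop.fs.s3n.awsAccessKeyId", "YOUR_ACCESS_KEY")
  .set("spark.hadoop.fs.s3n.awsSecretAccessKey", "YOUR_SECRET_KEY")
val sc = new SparkContext(sparkConf)
val rdd = sc.textFile("s3n://mybucket/data/")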

On Mon, Oct 13, 2014 at 2:56 PM, Ranga  wrote:

> [quoted text from earlier messages snipped]


Re: S3 Bucket Access

2014-10-13 Thread Ranga
Is there a way to specify a request header during the .textFile call?


- Ranga

On Mon, Oct 13, 2014 at 11:03 AM, Ranga  wrote:

> [quoted text from earlier messages snipped]


S3 Bucket Access

2014-10-13 Thread Ranga
Hi

I am trying to access files/buckets in S3 and encountering a permissions
issue. The buckets are configured to authenticate using an IAMRole provider.
I have set the KeyId and Secret using environment variables (
AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID). However, I am still unable to
access the S3 buckets.

Before setting the access key and secret the error was:
"java.lang.IllegalArgumentException:
AWS Access Key ID and Secret Access Key must be specified as the username
or password (respectively) of a s3n URL, or by setting the
fs.s3n.awsAccessKeyId or fs.s3n.awsSecretAccessKey properties
(respectively)."

After setting the access key and secret, the error is: "The AWS Access Key
Id you provided does not exist in our records."

The id/secret being set are the right values. This makes me believe that
something else ("token", etc.) needs to be set as well.
Any help is appreciated.


- Ranga