I think the standard S3 driver used in Spark from the Hadoop project (s3n) doesn't support IAM role-based authentication.
However, s3a should support it. If you're running Hadoop 2.6 via the spark-ec2 scripts (I'm not sure what it launches with by default), try accessing your bucket via s3a:// URLs instead of s3n://:

http://wiki.apache.org/hadoop/AmazonS3
https://issues.apache.org/jira/browse/HADOOP-10400

Thanks,
Ewan

-----Original Message-----
From: Greg Anderson [mailto:gregory.ander...@familysearch.org]
Sent: 22 July 2015 18:00
To: user@spark.apache.org
Subject: Help accessing protected S3

I have a protected S3 bucket that requires a certain IAM role to access. When I start my cluster using the spark-ec2 script, everything works just fine until I try to read from that part of S3. Here is the command I am using:

./spark-ec2 -k KEY -i KEY_FILE.pem --additional-security-group=IAM_ROLE --copy-aws-credentials --zone=us-east-1e -t m1.large --worker-instances=3 --hadoop-major-version=2.7.1 --user-data=test.sh launch my-cluster

I have read through this article:

http://apache-spark-user-list.1001560.n3.nabble.com/S3-Bucket-Access-td16303.html

The problem there seems very similar, but I wasn't able to find a solution in it that works for me. I'm not sure what else to provide here; just let me know what you need. Thanks in advance!

---------------------------------------------------------------------
To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
For additional commands, e-mail: user-h...@spark.apache.org
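[Editor's note: a minimal sketch of the Hadoop configuration the s3a suggestion relies on. In Hadoop 2.6/2.7 the s3a filesystem delegates to the AWS SDK credential chain, which can fall back to EC2 instance-profile credentials — the mechanism IAM roles use — when no access keys are configured. Whether spark-ec2 ships the required hadoop-aws and aws-java-sdk jars on the classpath is an assumption you would need to verify on your cluster.]

```xml
<!-- core-site.xml (sketch): map the s3a:// scheme to the S3A filesystem.
     Assumes hadoop-aws and the matching aws-java-sdk jar are on the
     classpath. With no fs.s3a.access.key/fs.s3a.secret.key set, s3a
     falls through to the instance-profile (IAM role) credentials. -->
<property>
  <name>fs.s3a.impl</name>
  <value>org.apache.hadoop.fs.s3a.S3AFileSystem</value>
</property>
```

With that in place, reads like sc.textFile("s3a://my-bucket/path") should authenticate via the instance's IAM role rather than embedded keys.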