I am using spark-1.6.0 and Java. I created a cluster using spark-ec2. I am having a heck of a time figuring out how to write from my streaming app to AWS S3. I should mention I have never used S3 before and am not sure it is set up correctly.
Here is the exception I get:

org.apache.hadoop.fs.s3.S3Exception: org.jets3t.service.S3ServiceException: S3 HEAD request failed for '/com.newco.test' - ResponseCode=403, ResponseMessage=Forbidden
    at org.apache.hadoop.fs.s3native.Jets3tNativeFileSystemStore.handleServiceException(Jets3tNativeFileSystemStore.java:229)

Here is what I did:

1. On my local Mac:
    1. export AWS_ACCESS_KEY_ID=MyKey
    2. export AWS_SECRET_ACCESS_KEY=mySecret
    3. $ spark-1.6.0-bin-hadoop2.6/ec2/spark-ec2 --key-pair=ec2-main --identity-file=~/.ssh/ec2-main.pem --region=us-west-1 --slaves=3 --copy-aws-credentials launch test_S3

2. I logged into the AWS S3 console, https://console.aws.amazon.com/s3/home?region=us-west-1#
    1. Created a bucket com.newco.test

1) Is my URL correct? s3n://s3-us-west-1.amazonaws.com/com.newco.test/streaming

2) --copy-aws-credentials did not seem to do anything.

Based on the log error messages and Googling, it looks like my app needs to set up the Hadoop configuration. What keys need to be set?

    JavaSparkContext jsc = new JavaSparkContext(conf);
    jsc.hadoopConfiguration().set( )

    "fs.s3.awsAccessKeyId" or "fs.s3n.awsAccessKeyId"
    "fs.s3.awsSecretAccessKey" or "fs.s3n.awsSecretAccessKey"
    "fs.s3.impl" ???

Kind regards

Andy
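
For reference, here is a minimal sketch of the configuration being asked about, not a confirmed fix for the 403. It assumes the s3n scheme as shipped with Hadoop 2.6, where the property names carry the same prefix as the URL scheme (fs.s3n.* for s3n:// URLs) and the URL has the form s3n://<bucket>/<path>, with the bucket name in the host position rather than the regional endpoint. The class S3nSettings and its methods are hypothetical helpers written for illustration; they are not part of Spark or Hadoop. The sketch uses only the Java standard library so it can be run without a cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration; not part of Spark or Hadoop.
class S3nSettings {

    // Builds an s3n URL of the form s3n://<bucket>/<path>.
    // Assumption: the bucket name, not the regional endpoint, follows the scheme.
    static String url(String bucket, String path) {
        String trimmed = path.startsWith("/") ? path.substring(1) : path;
        return "s3n://" + bucket + "/" + trimmed;
    }

    // Returns the Hadoop property names that the s3n scheme reads its
    // credentials from, mapped to the supplied values.
    static Map<String, String> hadoopProperties(String accessKey, String secretKey) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put("fs.s3n.awsAccessKeyId", accessKey);
        props.put("fs.s3n.awsSecretAccessKey", secretKey);
        return props;
    }

    public static void main(String[] args) {
        // Read the credentials from the same variables exported before launching.
        // These are null if the variables are not set in this environment.
        String accessKey = System.getenv("AWS_ACCESS_KEY_ID");
        String secretKey = System.getenv("AWS_SECRET_ACCESS_KEY");
        Map<String, String> props = hadoopProperties(accessKey, secretKey);

        // In the Spark driver, each entry would be applied to the context, e.g.:
        //   for (Map.Entry<String, String> e : props.entrySet()) {
        //       jsc.hadoopConfiguration().set(e.getKey(), e.getValue());
        //   }

        // Print the output location and the property names only, never the secrets.
        System.out.println(url("com.newco.test", "streaming"));
        for (String name : props.keySet()) {
            System.out.println(name);
        }
    }
}
```

If the credentials are valid and the 403 persists, the bucket permissions for that access key would be the next thing to check in the S3 console.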