Have you verified that you can download the file from bucket-name without using Spark?

Seems like a permission issue.
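One quick way to rule Spark out (a sketch, assuming the AWS CLI is installed and configured with the same access/secret key pair you set in the Spark job; `bucket-name` and `file.avro` are the placeholders from your snippet):

```shell
# Ask S3 for the object's metadata with the same credentials, bypassing
# Spark/Hadoop entirely. A 403 here confirms the keys or bucket policy
# are the problem; a 200 points the finger back at the Spark/S3A setup.
aws s3api head-object \
    --bucket bucket-name \
    --key file.avro \
    --endpoint-url https://s3-us-west-2.amazonaws.com
```

This issues the same HEAD request that `getObjectMetadata` in your stack trace is making, so it reproduces the failing call directly.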

Cheers



> On May 15, 2015, at 5:09 AM, Mohammad Tariq <donta...@gmail.com> wrote:
> 
> Hello list,
> 
> Scenario: I am trying to read an Avro file stored in S3 and create a
> DataFrame out of it using the spark-avro library, but I am unable to do so.
> This is the code I am using:
> 
> import org.apache.hadoop.conf.Configuration;
> import org.apache.spark.SparkConf;
> import org.apache.spark.api.java.JavaSparkContext;
> import org.apache.spark.sql.DataFrame;
> import org.apache.spark.sql.SQLContext;
> 
> public class S3DataFrame {
> 
>     public static void main(String[] args) {
>         System.out.println("START...");
>         SparkConf conf = new SparkConf().setAppName("DataFrameDemo").setMaster("local");
>         JavaSparkContext sc = new JavaSparkContext(conf);
>         Configuration config = sc.hadoopConfiguration();
>         config.set("fs.s3a.impl", "org.apache.hadoop.fs.s3a.S3AFileSystem");
>         config.set("fs.s3a.access.key", "****************");
>         config.set("fs.s3a.secret.key", "*****************");
>         config.set("fs.s3a.endpoint", "s3-us-west-2.amazonaws.com");
>         SQLContext sqlContext = new SQLContext(sc);
>         DataFrame df = sqlContext.load("s3a://bucket-name/file.avro", "com.databricks.spark.avro");
>         df.show();
>         df.printSchema();
>         df.select("title").show();
>         System.out.println("DONE");
> //      df.save("/new/dir/", "com.databricks.spark.avro");
>     }
> }
> 
> Problem: I am getting
> Exception in thread "main" com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden;
> 
> And this is the complete exception trace :
> 
> Exception in thread "main" com.amazonaws.services.s3.model.AmazonS3Exception: Forbidden (Service: Amazon S3; Status Code: 403; Error Code: 403 Forbidden; Request ID: 63A603F1DC6FB900), S3 Extended Request ID: vh5XhXSVO5ZvhX8c4I3tOWQD/T+B0ZW/MCYzUnuNnQ0R2JoBmJ0MPmUePRiQnPVASTbkonoFPIg=
>     at com.amazonaws.http.AmazonHttpClient.handleErrorResponse(AmazonHttpClient.java:1088)
>     at com.amazonaws.http.AmazonHttpClient.executeOneRequest(AmazonHttpClient.java:735)
>     at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:461)
>     at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:296)
>     at com.amazonaws.services.s3.AmazonS3Client.invoke(AmazonS3Client.java:3743)
>     at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1027)
>     at com.amazonaws.services.s3.AmazonS3Client.getObjectMetadata(AmazonS3Client.java:1005)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:688)
>     at org.apache.hadoop.fs.s3a.S3AFileSystem.getFileStatus(S3AFileSystem.java:71)
>     at org.apache.hadoop.fs.Globber.getFileStatus(Globber.java:57)
>     at org.apache.hadoop.fs.Globber.glob(Globber.java:248)
>     at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1623)
>     at com.databricks.spark.avro.AvroRelation.newReader(AvroRelation.scala:105)
>     at com.databricks.spark.avro.AvroRelation.<init>(AvroRelation.scala:60)
>     at com.databricks.spark.avro.DefaultSource.createRelation(DefaultSource.scala:41)
>     at org.apache.spark.sql.sources.ResolvedDataSource$.apply(ddl.scala:219)
>     at org.apache.spark.sql.SQLContext.load(SQLContext.scala:697)
>     at org.apache.spark.sql.SQLContext.load(SQLContext.scala:673)
>     at org.myorg.dataframe.S3DataFrame.main(S3DataFrame.java:25)
> 
> 
> I would really appreciate some help. Thank you so much for your time.
> 
> Software versions used:
> spark-1.3.1-bin-hadoop2.4
> hadoop-aws-2.6.0.jar
> MAC OS X 10.10.3
> java version "1.6.0_65"
> 
> Tariq, Mohammad
> about.me/mti
> 
