I'm not sure that's supposed to work. Can you try newAPIHadoopFile(), passing in the required Hadoop Configuration object?
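A minimal sketch of what that could look like, assuming Spark 1.x and the old-style s3:// credential keys from the original snippet (the bucket path and "XXXXXX" credentials are placeholders, not working values):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaPairRDD;
import org.apache.spark.api.java.JavaSparkContext;

public class S3ReadTest {
    public static void main(String[] args) {
        SparkConf sparkConf = new SparkConf().setAppName("TestApp").setMaster("local");
        JavaSparkContext sc = new JavaSparkContext(sparkConf);

        // Build a Hadoop Configuration carrying the S3 credentials explicitly,
        // rather than relying on sc.hadoopConfiguration() being picked up.
        Configuration conf = new Configuration();
        conf.set("fs.s3.awsAccessKeyId", "XXXXXX");
        conf.set("fs.s3.awsSecretAccessKey", "XXXXXX");

        // newAPIHadoopFile takes the path, the InputFormat class, the
        // key/value classes, and the Configuration object. For plain text
        // files the keys are byte offsets and the values are lines.
        JavaPairRDD<LongWritable, Text> lines = sc.newAPIHadoopFile(
                "s3://bucket/test/testdata",
                TextInputFormat.class,
                LongWritable.class,
                Text.class,
                conf);

        System.out.println(lines.count());
        sc.stop();
    }
}
```

This at least rules out the configuration object not reaching the underlying FileSystem lookup.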
On Tue, Oct 7, 2014 at 4:20 AM, Tomer Benyamini <tomer....@gmail.com> wrote:
> Hello,
>
> I'm trying to read from s3 using a simple spark java app:
>
> ---------------------
>
> SparkConf sparkConf = new SparkConf().setAppName("TestApp");
> sparkConf.setMaster("local");
> JavaSparkContext sc = new JavaSparkContext(sparkConf);
> sc.hadoopConfiguration().set("fs.s3.awsAccessKeyId", "XXXXXX");
> sc.hadoopConfiguration().set("fs.s3.awsSecretAccessKey", "XXXXXX");
>
> String path = "s3://bucket/test/testdata";
> JavaRDD<String> textFile = sc.textFile(path);
> System.out.println(textFile.count());
>
> ---------------------
>
> But I'm getting this error:
>
> org.apache.hadoop.mapred.InvalidInputException: Input path does not
> exist: s3://bucket/test/testdata
>     at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:251)
>     at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:270)
>     at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:175)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
>     at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
>     at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
>     at scala.Option.getOrElse(Option.scala:120)
>     at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
>     at org.apache.spark.SparkContext.runJob(SparkContext.scala:1097)
>     at org.apache.spark.rdd.RDD.count(RDD.scala:861)
>     at org.apache.spark.api.java.JavaRDDLike$class.count(JavaRDDLike.scala:365)
>     at org.apache.spark.api.java.JavaRDD.count(JavaRDD.scala:29)
>     ....
>
> Looking at the debug log, I see that
> org.jets3t.service.impl.rest.httpclient.RestS3Service returned a 404
> error trying to locate the file.
>
> Using a simple java program with
> com.amazonaws.services.s3.AmazonS3Client works just fine.
>
> Any idea?
>
> Thanks,
> Tomer
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org