My exception stack trace looks about the same:

java.io.FileNotFoundException: File /user/me/target/capacity-scheduler.xml does not exist.
        at org.apache.hadoop.fs.RawLocalFileSystem.getFileStatus(RawLocalFileSystem.java:397)
        at org.apache.hadoop.fs.FilterFileSystem.getFileStatus(FilterFileSystem.java:251)
        at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat$OneFileInfo.<init>(CombineFileInputFormat.java:489)
        at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getMoreSplits(CombineFileInputFormat.java:280)
        at org.apache.hadoop.mapreduce.lib.input.CombineFileInputFormat.getSplits(CombineFileInputFormat.java:240)
        at org.apache.spark.rdd.WholeTextFileRDD.getPartitions(NewHadoopRDD.scala:173)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
        at org.apache.spark.SparkContext.runJob(SparkContext.scala:1094)
        at org.apache.spark.rdd.RDD.collect(RDD.scala:717)

I'm running Hadoop 1.2.1, and everything else I've tried in Spark against
that version has worked, so I doubt it's a version mismatch.
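
For what it's worth, the trace goes through RawLocalFileSystem, which means
the path is being resolved against the local filesystem rather than HDFS.
One thing worth trying is fully qualifying the URI so the path can't fall
back to the local filesystem. A minimal sketch (hdfs://namenode:9000 is a
placeholder for whatever your fs.default.name actually points to):

    import org.apache.spark.{SparkConf, SparkContext}

    object WholeTextFilesCheck {
      def main(args: Array[String]): Unit = {
        val sc = new SparkContext(new SparkConf().setAppName("wholeTextFiles-check"))

        // Fully qualify the scheme; an unqualified path resolves against
        // whatever fs.default.name gives you, which in the trace above was
        // the local filesystem. "namenode:9000" is a placeholder URI.
        val files = sc.wholeTextFiles("hdfs://namenode:9000/user/me/target")

        // Print just the paths to confirm the right filesystem was hit.
        files.map { case (path, _) => path }.collect().foreach(println)

        sc.stop()
      }
    }

If the qualified URI works, the Hadoop conf directory (with core-site.xml)
probably just isn't on the Spark classpath.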


