It looks like this has been broken since around Spark 1.5; please see
JIRA SPARK-10185. The fix landed in PySpark, but SparkR was
unfortunately missed. I have confirmed this is still broken in Spark 1.6.
Could you please open a JIRA?
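In the meantime, a possible workaround is to read each directory
separately and union the results. A minimal sketch, assuming all the
directories share the same JSON schema (the paths are placeholders):

paths <- c("/path/to/dir1", "/path/to/dir2")
# Single-path jsonFile() calls still work, so read each directory on its own.
dfs <- lapply(paths, function(p) jsonFile(sqlContext, p))
# Fold the per-directory DataFrames together; unionAll() needs matching schemas.
combined <- Reduce(unionAll, dfs)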
On Thu, Dec 3, 2015 at 2:08 PM -0800, "tomasr3" <tomas.rodrig...@transvoyant.com> wrote:

Hello,

I believe I have encountered a bug in Spark 1.5.2. I am using RStudio and
SparkR to read JSON files with jsonFile(sqlContext, "path"). If "path" is
a single path (e.g., "/path/to/dir0"), it works fine; but when "path" is a
vector of paths, e.g.

path <- c("/path/to/dir1", "/path/to/dir2")

then I get the following error message:

> raw.terror <- jsonFile(sqlContext, path)
15/12/03 15:59:55 ERROR RBackendHandler: jsonFile on 1 failed
Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
  java.io.IOException: No input paths specified in job
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:201)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
        at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2

Note that passing a vector of paths works just fine in Spark 1.4.1. If this
is not a bug but rather an environment or configuration issue, any help is
greatly appreciated.
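For reference, a minimal side-by-side of the working and failing calls
(the paths are placeholders):

jsonFile(sqlContext, "/path/to/dir0")                       # single path: works
jsonFile(sqlContext, c("/path/to/dir1", "/path/to/dir2"))   # vector of paths: fails as above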

Best,
T




