I have created SPARK-12146 to track this issue.
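In the meantime, a possible workaround is to read each directory separately and union the results. This is only a sketch and untested against 1.5.2 — it assumes the single-path form of jsonFile still works (as reported below) and that SparkR's unionAll behaves as documented:

```r
# Workaround sketch: call jsonFile() once per directory (the single-path
# form is unaffected by the bug), then union the resulting DataFrames.
# Assumes sqlContext already exists and the JSON schemas match across dirs.
paths <- c("/path/to/dir1", "/path/to/dir2")
dfs <- lapply(paths, function(p) jsonFile(sqlContext, p))
combined <- Reduce(unionAll, dfs)
```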

2015-12-04 9:16 GMT+08:00 Felix Cheung <felixcheun...@hotmail.com>:

> It looks like this has been broken around Spark 1.5.
>
> Please see JIRA SPARK-10185. This has been fixed in pyspark but
> unfortunately SparkR was missed. I have confirmed this is still broken in
> Spark 1.6.
>
> Could you please open a JIRA?
>
>
> On Thu, Dec 3, 2015 at 2:08 PM -0800, "tomasr3" <
> tomas.rodrig...@transvoyant.com> wrote:
>
> Hello,
>
> I believe I have encountered a bug in Spark 1.5.2. I am using RStudio and
> SparkR to read JSON files with jsonFile(sqlContext, "path"). If "path" is
> a single path (e.g., "/path/to/dir0"), it works fine; but when "path" is
> a vector of paths (e.g., path <- c("/path/to/dir1", "/path/to/dir2")), I
> get the following error
> message:
>
> > raw.terror <- jsonFile(sqlContext, path)
> 15/12/03 15:59:55 ERROR RBackendHandler: jsonFile on 1 failed
> Error in invokeJava(isStatic = FALSE, objId$id, methodName, ...) :
>   java.io.IOException: No input paths specified in job
>         at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:201)
>         at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:313)
>         at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
>         at scala.Option.getOrElse(Option.scala:120)
>         at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
>         at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
>         at org.apache.spark.rdd.RDD$$anonfun$partitions$2
>
> Note that passing a vector of paths works just fine in Spark 1.4.1. If
> this is not a bug but rather an environment or configuration issue, any
> help is greatly appreciated.
>
> Best,
> T
>
>
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/SparkR-in-Spark-1-5-2-jsonFile-Bug-Found-tp25560.html
> Sent from the Apache Spark User List mailing list archive at Nabble.com.
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: user-unsubscr...@spark.apache.org
> For additional commands, e-mail: user-h...@spark.apache.org
>
>
