[ https://issues.apache.org/jira/browse/SPARK-5744?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Tobias Bertelsen updated SPARK-5744:
------------------------------------
    Description: 
The implementation of {{RDD.isEmpty()}} fails if there are empty partitions. It was introduced by https://github.com/apache/spark/pull/4074

Example:
{code}
sc.parallelize(Seq(), 1).isEmpty()
{code}

The above code throws an exception like this:
{code}
org.apache.spark.SparkDriverExecutionException: Execution error
  at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:977)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1374)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1338)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Cause: java.lang.ArrayStoreException: [Ljava.lang.Object;
  at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
  at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1466)
  at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1466)
  at org.apache.spark.scheduler.JobWaiter.taskSucceeded(JobWaiter.scala:56)
  at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:973)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1374)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1338)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
{code}

  was:
The implementation of {{RDD.isEmpty()}} fails if there are empty partitions.
It was introduced by https://github.com/apache/spark/pull/4074

Example:
{code:scala}
sc.parallelize(Seq(), 1).isEmpty()
{code}

The above code throws an exception like this:
{code}
org.apache.spark.SparkDriverExecutionException: Execution error
  at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:977)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1374)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1338)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
Cause: java.lang.ArrayStoreException: [Ljava.lang.Object;
  at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
  at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1466)
  at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1466)
  at org.apache.spark.scheduler.JobWaiter.taskSucceeded(JobWaiter.scala:56)
  at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:973)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1374)
  at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1338)
  at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
{code}


> RDD.isEmpty fails when rdd contains empty partitions.
> -----------------------------------------------------
>
>                 Key: SPARK-5744
>                 URL: https://issues.apache.org/jira/browse/SPARK-5744
>             Project: Spark
>          Issue Type: Bug
>          Components: Spark Core
>            Reporter: Tobias Bertelsen
>            Priority: Critical
>   Original Estimate: 0h
>  Remaining Estimate: 0h
>
> The implementation of {{RDD.isEmpty()}} fails if there are empty partitions.
> It was introduced by https://github.com/apache/spark/pull/4074
> Example:
> {code}
> sc.parallelize(Seq(), 1).isEmpty()
> {code}
> The above code throws an exception like this:
> {code}
> org.apache.spark.SparkDriverExecutionException: Execution error
>   at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:977)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1374)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1338)
>   at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> Cause: java.lang.ArrayStoreException: [Ljava.lang.Object;
>   at scala.runtime.ScalaRunTime$.array_update(ScalaRunTime.scala:88)
>   at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1466)
>   at org.apache.spark.SparkContext$$anonfun$runJob$4.apply(SparkContext.scala:1466)
>   at org.apache.spark.scheduler.JobWaiter.taskSucceeded(JobWaiter.scala:56)
>   at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:973)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1374)
>   at org.apache.spark.scheduler.DAGSchedulerEventProcessLoop.onReceive(DAGScheduler.scala:1338)
>   at org.apache.spark.util.EventLoop$$anon$1.run(EventLoop.scala:48)
> {code}


--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
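[Editor's note, not part of the original report: a hedged reading of the trace above. {{Seq()}} is inferred as {{Seq[Nothing]}}, so the resulting RDD carries a {{ClassTag[Nothing]}}; when {{isEmpty()}} calls {{take(1)}}, {{runJob}} allocates its driver-side results array from that class tag, and {{ScalaRunTime.array_update}} then fails to store a task's boxed result into it, matching the {{ArrayStoreException}} in the trace. Under that assumption, a possible interim workaround is to give the empty {{Seq}} a concrete element type:]

{code:scala}
// Hedged sketch, assuming the failure is specific to RDD[Nothing].
// Seq() infers Seq[Nothing], so runJob's result array is allocated
// from ClassTag[Nothing] and cannot hold a task's boxed result.
sc.parallelize(Seq(), 1).isEmpty()       // throws SparkDriverExecutionException

// Possible workaround until a fix lands: a concrete element type
// gives the RDD a usable ClassTag, so take(1) sees no elements
// and isEmpty() should simply return true.
sc.parallelize(Seq[Int](), 1).isEmpty()
{code}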