Spark 0.9.0
Hi Spark devs, I'm pretty sure this stacktrace is a bug in the way Spark is using the type system but I don't quite know what it is. Something to do with type bounds judging from my Googling. Can someone with more Scala-foo than me please take a look? In the meantime I'll be avoiding top() for a bit. Thanks! Andrew This stracktrace came about when I called val myRDD: RDD[(String,Int)] = ... myRDD.reduceByKey(_+_).top(100) But my toy example doesn't trigger the repro: sc.parallelize(Seq( ("A",10), ("B",5), ("A",4), ("C", 15) )).reduceByKey(_+_).top(2) 14/02/14 03:15:07 ERROR OneForOneStrategy: scala.collection.immutable.$colon$colon cannot be cast to org.apache.spark.util.BoundedPriorityQueue java.lang.ClassCastException: scala.collection.immutable.$colon$colon cannot be cast to org.apache.spark.util.BoundedPriorityQueue at org.apache.spark.rdd.RDD$$anonfun$top$2.apply(RDD.scala:873) at org.apache.spark.rdd.RDD$$anonfun$6.apply(RDD.scala:671) at org.apache.spark.rdd.RDD$$anonfun$6.apply(RDD.scala:668) at org.apache.spark.scheduler.JobWaiter.taskSucceeded(JobWaiter.scala:56) at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:859) at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:616) at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:207) at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498) at akka.actor.ActorCell.invoke(ActorCell.scala:456) at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237) at akka.dispatch.Mailbox.run(Mailbox.scala:219) at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386) at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260) at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339) at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979) at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)