Sorry, I don't yet have a consistent repro that doesn't depend on my internal dataset. That example is the repro I attempted to create, but it doesn't trigger the issue. Hopefully I'll be able to put together a consistent repro to share.
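For anyone digging into this in the meantime: judging from the stack trace, the cast fails in the reduce step inside top(). As far as I can tell from RDD.scala in 0.9, top() builds a BoundedPriorityQueue per partition and then merges the queues with reduce, roughly like this (paraphrased from memory, so details may be off):

  def top(num: Int)(implicit ord: Ordering[T]): Array[T] = {
    mapPartitions { items =>
      // keep only the num largest items seen in this partition
      val queue = new BoundedPriorityQueue[T](num)
      queue ++= items
      Iterator.single(queue)
    }.reduce { (queue1, queue2) =>
      // merge the per-partition queues into one
      queue1 ++= queue2
      queue1
    }.toArray.sorted(ord.reverse)
  }

So a task result that should deserialize as a BoundedPriorityQueue is somehow arriving at the driver as a scala.collection.immutable.List (the $colon$colon in the trace). As a stopgap I'm sorting by value instead, something like myRDD.reduceByKey(_+_).map(_.swap).sortByKey(ascending = false).take(100) (an untested sketch, and it returns swapped pairs, but it avoids top() entirely).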
On Fri, Feb 14, 2014 at 3:58 AM, Xuefeng Wu <ben...@gmail.com> wrote:

> Hi Andrew,
>
> Sorry, I cannot reproduce the issue with:
>
> scala> import org.apache.spark.rdd.RDD
>
> scala> val myRDD: RDD[(String,Int)] = sc.parallelize(Seq( ("A",10),
> ("B",5), ("A",4), ("C", 15)))
>
> scala> myRDD.reduceByKey(_+_).top(2)
>
> Is there anything different compared with your example?
>
>
> On Fri, Feb 14, 2014 at 7:38 PM, Andrew Ash <and...@andrewash.com> wrote:
>
> > Spark 0.9.0
> >
> > Hi Spark devs,
> >
> > I'm pretty sure this stack trace points to a bug in the way Spark is
> > using the type system, but I don't quite know what it is. Something to
> > do with type bounds, judging from my Googling.
> >
> > Can someone with more Scala-foo than me please take a look? In the
> > meantime I'll be avoiding top() for a bit.
> >
> > Thanks!
> > Andrew
> >
> >
> > This stack trace came about when I called:
> >
> > val myRDD: RDD[(String,Int)] = ...
> > myRDD.reduceByKey(_+_).top(100)
> >
> > But my toy example doesn't trigger it:
> >
> > sc.parallelize(Seq( ("A",10), ("B",5), ("A",4), ("C", 15)
> > )).reduceByKey(_+_).top(2)
> >
> >
> > 14/02/14 03:15:07 ERROR OneForOneStrategy: scala.collection.immutable.$colon$colon cannot be cast to org.apache.spark.util.BoundedPriorityQueue
> > java.lang.ClassCastException: scala.collection.immutable.$colon$colon cannot be cast to org.apache.spark.util.BoundedPriorityQueue
> >     at org.apache.spark.rdd.RDD$$anonfun$top$2.apply(RDD.scala:873)
> >     at org.apache.spark.rdd.RDD$$anonfun$6.apply(RDD.scala:671)
> >     at org.apache.spark.rdd.RDD$$anonfun$6.apply(RDD.scala:668)
> >     at org.apache.spark.scheduler.JobWaiter.taskSucceeded(JobWaiter.scala:56)
> >     at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:859)
> >     at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:616)
> >     at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:207)
> >     at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
> >     at akka.actor.ActorCell.invoke(ActorCell.scala:456)
> >     at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
> >     at akka.dispatch.Mailbox.run(Mailbox.scala:219)
> >     at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
> >     at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
> >     at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
> >     at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
> >     at scala.concurrent.forkjoin.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:107)
>
> --
> ~Yours respectfully, Xuefeng Wu / 吴雪峰