I am using the latest from GitHub, compiled locally.

On Sat, Feb 22, 2014 at 3:22 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:
> Which version of Spark are you using?
>
> TD
>
> On Sat, Feb 22, 2014 at 3:15 PM, Fabrizio Milo aka misto <mistob...@gmail.com> wrote:
>>
>> Well, it turns out you can use the takeOrdered function and create your
>> own Ordering object:
>>
>> object AceScoreOrdering extends Ordering[Record] {
>>   def compare(a: Record, b: Record) = a.score.ace_score compare b.score.ace_score
>> }
>>
>> val collected = dataset.takeOrdered(topN)(AceScoreOrdering)
>>
>> That is what I really wanted, but now for some reason I am getting this error:
>>
>> 14/02/22 09:11:53 ERROR actor.OneForOneStrategy: scala.collection.immutable.Nil$ cannot be cast to org.apache.spark.util.BoundedPriorityQueue
>> java.lang.ClassCastException: scala.collection.immutable.Nil$ cannot be cast to org.apache.spark.util.BoundedPriorityQueue
>>   at org.apache.spark.rdd.RDD$$anonfun$top$2.apply(RDD.scala:941)
>>   at org.apache.spark.rdd.RDD$$anonfun$8.apply(RDD.scala:727)
>>   at org.apache.spark.rdd.RDD$$anonfun$8.apply(RDD.scala:724)
>>   at org.apache.spark.scheduler.JobWaiter.taskSucceeded(JobWaiter.scala:56)
>>   at org.apache.spark.scheduler.DAGScheduler.handleTaskCompletion(DAGScheduler.scala:843)
>>   at org.apache.spark.scheduler.DAGScheduler.processEvent(DAGScheduler.scala:598)
>>   at org.apache.spark.scheduler.DAGScheduler$$anonfun$start$1$$anon$2$$anonfun$receive$1.applyOrElse(DAGScheduler.scala:190)
>>   at akka.actor.ActorCell.receiveMessage(ActorCell.scala:498)
>>   at akka.actor.ActorCell.invoke(ActorCell.scala:456)
>>   at akka.dispatch.Mailbox.processMailbox(Mailbox.scala:237)
>>   at akka.dispatch.Mailbox.run(Mailbox.scala:219)
>>   at akka.dispatch.ForkJoinExecutorConfigurator$AkkaForkJoinTask.exec(AbstractDispatcher.scala:386)
>>   at scala.concurrent.forkjoin.ForkJoinTask.doExec(ForkJoinTask.java:260)
>>   at scala.concurrent.forkjoin.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1339)
>>   at scala.concurrent.forkjoin.ForkJoinPool.runWorker(ForkJoinPool.java:1979)
>>
>> On Sat, Feb 22, 2014 at 2:56 PM, Tathagata Das <tathagata.das1...@gmail.com> wrote:
>> > You can use RDD.sortByKey to sort as well: rdd.map(x => (x, x)).sortByKey(...).map(_._1)
>> >
>> > Not sure if it will work, but rdd.map(x => (x, null)).sortByKey(...) may be more efficient.
>> >
>> > TD
>> >
>> > On Sat, Feb 22, 2014 at 2:41 PM, Fabrizio Milo aka misto <mistob...@gmail.com> wrote:
>> >>
>> >> Hello everyone,
>> >>
>> >> Is it possible to do a parallel sort using Spark?
>> >> I would expect some kind of method rdd.sort((a, b) => a < b),
>> >> but I can only find sortByKey.
>> >>
>> >> Am I missing something?
>> >>
>> >> Thanks
>> >>
>> >> Fabrizio
--
LinkedIn: http://linkedin.com/in/fmilo
Twitter: @fabmilo
Github: http://github.com/Mistobaan/
-----------------------
Simplicity, consistency, and repetition - that's how you get through. (Jack Welch)
Perfection must be reached by degrees; she requires the slow hand of time. (Voltaire)
The best way to predict the future is to invent it. (Alan Kay)
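
For anyone finding this thread later, here is a minimal, self-contained sketch of the two approaches discussed above: takeOrdered with a custom Ordering, and a full parallel sort via sortByKey. The Record and Score case classes, the ace_score field, and the local[*] master are assumptions standing in for the poster's actual types; this is not the exact code from the thread.

// Sketch only: Record/Score are hypothetical stand-ins for the poster's classes.
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.SparkContext._ // pair-RDD implicits; needed on the 0.9/1.x-era Spark
                                       // used in this thread, automatic in newer versions

case class Score(ace_score: Double)
case class Record(id: Long, score: Score)

object TopNExample {
  def main(args: Array[String]): Unit = {
    val sc = new SparkContext(new SparkConf().setAppName("topn-example").setMaster("local[*]"))
    val dataset = sc.parallelize(Seq(
      Record(1L, Score(0.9)), Record(2L, Score(0.1)), Record(3L, Score(0.5))))
    val topN = 2

    // Approach 1: takeOrdered with a custom Ordering.
    // Ordering.by is a shorter equivalent of the hand-written AceScoreOrdering above.
    val aceScoreOrdering: Ordering[Record] = Ordering.by((r: Record) => r.score.ace_score)
    val smallestN = dataset.takeOrdered(topN)(aceScoreOrdering)         // topN smallest scores
    val largestN  = dataset.takeOrdered(topN)(aceScoreOrdering.reverse) // topN largest scores

    // Approach 2: full parallel sort by promoting the sort field to a key,
    // sorting by key, then dropping the key again.
    val sortedByScore = dataset
      .map(r => (r.score.ace_score, r))
      .sortByKey(ascending = true)
      .map(_._2)

    println(smallestN.mkString(", "))
    println(largestN.mkString(", "))
    println(sortedByScore.take(topN).mkString(", "))
    sc.stop()
  }
}

takeOrdered only ships topN elements per partition back to the driver, so it is usually the cheaper choice when you only need the top results; the sortByKey route materializes a fully sorted RDD and is the one to use when you need the entire dataset in order.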