subject:"taking top k values of rdd"

Re: taking top k values of rdd

2014-07-05 Thread Nick Pentreath

On Sat, Jul 5, 2014 at 10:17 AM, Koert Kuipers ko...@tresata.com wrote: my initial approach to taking top k values of a rdd was using a priority-queue monoid. along these lines: rdd.mapPartitions({ items => Iterator.single(new PriorityQueue(...)) }, false).reduce(monoid.plus) this works fine

Re: taking top k values of rdd

2014-07-05 Thread Koert Kuipers

. On the driver you can just top k the combined top k from each partition (assuming you have (object, count) for each top k list). On Sat, Jul 5, 2014 at 10:17 AM, Koert Kuipers ko...@tresata.com wrote: my initial approach to taking top k values

Re: taking top k values of rdd

2014-07-05 Thread Nick Pentreath

, 2014 at 10:17 AM, Koert Kuipers ko...@tresata.com wrote: my initial approach to taking top k values of a rdd was using a priority-queue monoid. along these lines: rdd.mapPartitions({ items => Iterator.single(new PriorityQueue(...)) }, false).reduce(monoid.plus) this works fine, but looking
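The idea discussed in these messages (keep a bounded top-k per partition, then merge the per-partition results) can be sketched in plain Scala without Spark. This is only an illustration under stated assumptions: partitions are modelled as a Seq[Seq[Int]], and the names TopK, topKOfPartition, plus and topK are hypothetical, not taken from the thread or from any library. The elided PriorityQueue(...) arguments in the quoted snippet are not reconstructed here.

```scala
import scala.collection.mutable

// Hypothetical helper object; plain Scala collections stand in for an RDD.
object TopK {
  // Keep only the k largest items seen so far. The queue uses a reversed
  // ordering, so dequeue() evicts the smallest retained item.
  def topKOfPartition(items: Iterator[Int], k: Int): mutable.PriorityQueue[Int] = {
    val queue = mutable.PriorityQueue.empty[Int](Ordering[Int].reverse)
    items.foreach { x =>
      queue.enqueue(x)
      if (queue.size > k) queue.dequeue()
    }
    queue
  }

  // Monoid-style "plus": fold the second bounded queue into the first,
  // trimming back to at most k items after each insertion.
  def plus(k: Int)(a: mutable.PriorityQueue[Int],
                   b: mutable.PriorityQueue[Int]): mutable.PriorityQueue[Int] = {
    b.foreach { x =>
      a.enqueue(x)
      if (a.size > k) a.dequeue()
    }
    a
  }

  // One bounded queue per partition, merged pairwise, then sorted descending.
  // Assumes at least one partition, since reduce fails on an empty Seq.
  def topK(partitions: Seq[Seq[Int]], k: Int): List[Int] =
    partitions
      .map(p => topKOfPartition(p.iterator, k))
      .reduce((a, b) => plus(k)(a, b))
      .toList
      .sorted(Ordering[Int].reverse)
}
```

For example, TopK.topK(Seq(Seq(5, 1, 9), Seq(7, 3), Seq(8, 2, 6)), 3) yields List(9, 8, 7). Each intermediate queue holds at most k items, so the merge step only ever handles k items per partition. In Spark itself, RDD.top(k) is the built-in call for this task.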

3 matches
