If you use rdd.mapPartitions(), you get access to the iterator for each partition, and you can then call iterator.grouped(size) on it. Since the grouping happens within each partition separately, I think the last group in each partition may have fewer than "size" elements. If that's okay for you, then that should work.
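
Something along these lines might do it. This is only a rough sketch: the names GroupedRdd, chunk and groupedPerPartition are made up for illustration, and the local[2] master and the 1 to 10 sample data are assumptions so that it runs on its own.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object GroupedRdd {
  // Plain-Scala chunking of one iterator; the final chunk may be shorter than `size`.
  def chunk[T](it: Iterator[T], size: Int): Iterator[Seq[T]] = it.grouped(size)

  // Apply the chunking to each partition independently.
  // Groups never span partition boundaries, so each partition
  // can end with a group of fewer than `size` elements.
  def groupedPerPartition[T](rdd: RDD[T], size: Int): RDD[Seq[T]] =
    rdd.mapPartitions[Seq[T]](it => chunk(it, size))

  def main(args: Array[String]): Unit = {
    // Assumed local setup with two worker threads, for illustration only.
    val conf = new SparkConf().setAppName("grouped-rdd").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val rdd = sc.parallelize(1 to 10, 2) // two partitions
      groupedPerPartition(rdd, 2).collect().foreach(group => println(group.mkString(", ")))
    } finally {
      sc.stop()
    }
  }
}
```

If you need every group except the very last one in the whole RDD to be full, this won't give you that, because a short group can show up at the end of any partition.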

On Sat, Jun 20, 2015 at 7:48 PM, Brandon White <[email protected]> wrote:
> How would you do a .grouped(10) on a RDD, is it possible? Here is an
> example for a Scala list
>
> scala> List(1,2,3,4).grouped(2).toList
> res1: List[List[Int]] = List(List(1, 2), List(3, 4))
>
> Would like to group n elements.
