Re: Partition n keys into exactly n partitions

2016-09-13 Thread Christophe Préaud
Hi,

A custom partitioner is indeed the solution. Here is some sample code:

    import org.apache.spark.Partitioner

    class KeyPartitioner(keyList: Seq[Any]) extends Partitioner {
      def numPartitions: Int = keyList.size + 1
      def getPartition(key: Any): Int = keyList.indexOf(key) + 1
      override def
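The sample above is cut off in the archive. Note that it reserves one extra partition: indexOf returns -1 for a key that is not in keyList, so unlisted keys land in partition 0 and listed keys occupy partitions 1 to n. As a rough sketch of one way to get exactly n partitions for n keys instead (not the code from this message), the variant below maps each listed key to its own partition and rejects unlisted keys. The names ExactKeyPartitioner and ExactKeyPartitionerDemo, the toy data, and the local[2] master are assumptions made for illustration.

```scala
import org.apache.spark.{Partitioner, SparkConf, SparkContext}

// Hypothetical variant: one partition per listed key, so numPartitions
// equals the number of distinct keys.
class ExactKeyPartitioner(keys: Seq[Any]) extends Partitioner {
  // Precompute key -> partition index so lookups do not scan the key list.
  private val indexOfKey: Map[Any, Int] = keys.distinct.zipWithIndex.toMap

  override def numPartitions: Int = indexOfKey.size

  // Fail fast on a key that was not listed (an assumption of this sketch;
  // the sample above sends such keys to partition 0 instead).
  override def getPartition(key: Any): Int =
    indexOfKey.getOrElse(key, throw new IllegalArgumentException(s"Unknown key: $key"))

  // Spark compares partitioners to decide whether an RDD is already
  // partitioned the right way, so equality is based on the key mapping.
  override def equals(other: Any): Boolean = other match {
    case p: ExactKeyPartitioner => p.indexOfKey == indexOfKey
    case _                      => false
  }

  override def hashCode: Int = indexOfKey.hashCode
}

object ExactKeyPartitionerDemo {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("exact-key-partitioner").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      val keys = Seq("x", "y", "z")
      // Toy data: four records per key.
      val records = for (k <- keys; i <- 1 to 4) yield (k, i)

      val partitioned = sc.parallelize(records).partitionBy(new ExactKeyPartitioner(keys))

      // List the distinct keys that ended up in each partition.
      partitioned
        .mapPartitionsWithIndex((idx, it) => Iterator((idx, it.map(_._1).toSet)))
        .collect()
        .foreach { case (idx, ks) => println(s"partition $idx -> $ks") }
    } finally {
      sc.stop()
    }
  }
}
```

Throwing on an unlisted key keeps the partition count at exactly n; if unknown keys can occur in the data, the extra catch-all partition used in the sample above is the safer choice.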

Re: Partition n keys into exactly n partitions

2016-09-12 Thread Denis Bolshakov
Just provide your own partitioner. I once wrote a partitioner that keeps similar keys together in one partition.

Best regards,
Denis

On 12 September 2016 at 19:44, sujeet jog wrote:
> Hi,
>
> Is there a way to partition set of data with n keys into exactly n
> partitions.

Partition n keys into exactly n partitions

2016-09-12 Thread sujeet jog
Hi,

Is there a way to partition a set of data with n keys into exactly n partitions?

For example:
- a tuple of 1008 rows with key x
- a tuple of 1008 rows with key y
- and so on, for 10 keys in total (x, y, etc.)

Total records = 10080
NumOfKeys = 10

I want to partition the 10080 elements into exactly 10