Yes. You can implement your own partitioner that assigns each key to a partition according to your own logic, for example by key range.
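As a rough illustration of the logic such a partitioner might use, here is a minimal plain-Python sketch that does not use Spark itself. The names `make_range_partitioner` and `split_into_ranges` and the hand-picked boundary values are hypothetical; in a real job the boundaries would have to come from sampling or prior knowledge of the keys. In PySpark, a function of this shape could probably be passed as the `partitionFunc` argument of `RDD.repartitionAndSortWithinPartitions` on a key/value RDD, but please check that against the documentation for your Spark version.

```python
from bisect import bisect_left


def make_range_partitioner(boundaries):
    """Return a function that maps a key to a partition index.

    `boundaries` is a sorted list of inclusive upper bounds, one for every
    partition except the last. Keys greater than the final bound fall into
    the last partition. (Hypothetical helper, not part of Spark.)
    """
    def partition_for(key):
        # bisect_left returns the index of the first boundary >= key,
        # so a key equal to a boundary stays in that boundary's partition.
        return bisect_left(boundaries, key)

    return partition_for


def split_into_ranges(keys, boundaries):
    """Sort `keys` and group them into contiguous, non-overlapping ranges."""
    partition_for = make_range_partitioner(boundaries)
    partitions = [[] for _ in range(len(boundaries) + 1)]
    for key in sorted(keys):
        partitions[partition_for(key)].append(key)
    return partitions


if __name__ == "__main__":
    keys = [1, 2, 3, 4, 5, 6]
    print(split_into_ranges(keys, [3]))     # [[1, 2, 3], [4, 5, 6]]
    print(split_into_ranges(keys, [2, 4]))  # [[1, 2], [3, 4], [5, 6]]
```

Python `bytes` values compare lexicographically, so the same sketch works for byte-string keys such as `b"abc"`.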
Yong

________________________________
From: Kristoffer Sjögren <sto...@gmail.com>
Sent: Monday, March 13, 2017 9:34 AM
To: user
Subject: Sorted partition ranges without overlap

Hi

I have an RDD<byte[]> that needs to be sorted lexicographically and then processed by partition. The partitions should be split into ranged blocks where sorted order is maintained and each partition contains sequential, non-overlapping keys.

Given keys (1,2,3,4,5,6)

1. Correct
- 2 partitions = (1,2,3),(4,5,6)
- 3 partitions = (1,2),(3,4),(5,6)

2. Incorrect, the ranges overlap even though they're sorted.
- 2 partitions = (1,3,5),(2,4,6)
- 3 partitions = (1,3),(2,5),(4,6)

Is this possible with Spark?

Cheers,
-Kristoffer