Are you looking for these?

*mapPartitions*(*func*): Similar to map, but runs separately on each partition (block) of the RDD, so *func* must be of type Iterator<T> => Iterator<U> when running on an RDD of type T.

*mapPartitionsWithIndex*(*func*): Similar to mapPartitions, but also provides *func* with an integer value representing the index of the partition, so *func* must be of type (Int, Iterator<T>) => Iterator<U> when running on an RDD of type T.
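As a rough illustration of how mapPartitionsWithIndex could answer both questions (which objects sit in each partition, and which partition an object belongs to), here is a minimal sketch. It assumes a local SparkContext and made-up sample data shaped like locLines; the names PartitionInspection, tagWithIndex, pairs and tagged are hypothetical and not from the thread. Note that partitionBy returns a new RDD, so the partition count has to be checked on the returned RDD rather than on the original one.

```scala
import org.apache.spark.{HashPartitioner, SparkConf, SparkContext}

object PartitionInspection {
  // Hypothetical helper: pairs every record with the index of its partition.
  // It has the (Int, Iterator[T]) => Iterator[U] shape that
  // mapPartitionsWithIndex expects.
  def tagWithIndex[T](idx: Int, iter: Iterator[T]): Iterator[(Int, T)] =
    iter.map(record => (idx, record))

  def main(args: Array[String]): Unit = {
    // Assumption: local mode, purely for illustration.
    val conf = new SparkConf().setAppName("partition-inspection").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Made-up records in the same shape as locLines: ((Double, Double), Long).
      val pairs = sc.parallelize(Seq(
        ((1.0, 2.0), 10L),
        ((3.0, 4.0), 20L),
        ((5.0, 6.0), 30L),
        ((7.0, 8.0), 40L)
      ))

      // partitionBy does not modify `pairs`; it returns a new RDD.
      val partitioner = new HashPartitioner(3)
      val partitioned = pairs.partitionBy(partitioner)
      println(s"partitions: ${partitioned.partitions.length}") // prints "partitions: 3"

      // Objects in each partition: tag every record with its partition index.
      val tagged = partitioned.mapPartitionsWithIndex((idx, iter) => tagWithIndex(idx, iter))
      tagged.collect().foreach { case (idx, record) =>
        println(s"partition $idx -> $record")
      }

      // Partition of a given key: ask the partitioner directly.
      val key = (1.0, 2.0)
      println(s"key $key -> partition ${partitioner.getPartition(key)}")
    } finally {
      sc.stop()
    }
  }
}
```

collect() brings every record to the driver, so this is only sensible for small RDDs or for debugging; on a large RDD it is probably better to count or sample within each partition instead.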
On Wed, Apr 22, 2015 at 1:00 AM, MUHAMMAD AAMIR <[email protected]> wrote:

> Hi Archit,
>
> Thanks a lot for your reply. I am using "rdd.partitions.length" to check
> the number of partitions; rdd.partitions returns the array of partitions.
> I would like to add one more question here: do you have any idea how to get
> the objects in each partition? Further, is there any way to figure out
> which particular partition an object belongs to?
>
> Thanks,
>
> On Tue, Apr 21, 2015 at 12:16 PM, Archit Thakur <[email protected]>
> wrote:
>
>> Hi,
>>
>> This should work. How are you checking the number of partitions?
>>
>> Thanks and Regards,
>> Archit Thakur.
>>
>> On Mon, Apr 20, 2015 at 7:26 PM, mas <[email protected]> wrote:
>>
>>> Hi,
>>>
>>> I aim to do custom partitioning on a text file. I first convert it into
>>> a pair RDD and then try to use my custom partitioner. However, somehow
>>> it is not working. My code snippet is given below.
>>>
>>> val file = sc.textFile(filePath)
>>> val locLines = file.map(line => line.split("\t")).map(line =>
>>>   ((line(2).toDouble, line(3).toDouble), line(5).toLong))
>>> // new CustomPartitioner(50) -- neither way is working here.
>>> val ck = locLines.partitionBy(new HashPartitioner(50))
>>>
>>> While reading the file, the "textFile" method automatically partitions
>>> it. However, when I explicitly want to partition the new RDD
>>> "locLines", it doesn't appear to do anything, and even the number of
>>> partitions is the same as that created by sc.textFile().
>>>
>>> Any help in this regard will be highly appreciated.
>>>
>>> --
>>> View this message in context:
>>> http://apache-spark-user-list.1001560.n3.nabble.com/Custom-Partitioning-Spark-tp22571.html
>>> Sent from the Apache Spark User List mailing list archive at Nabble.com.
> --
> Regards,
> Muhammad Aamir

--
Best Regards,
Ayan Guha
