The direct answer you are looking for may be RDD.mapPartitionsWithIndex().
The better question is: why are you looking at only the 3rd partition? To analyze a random sample? Then look into RDD.sample(). Are you sure the data you are looking for is in the 3rd partition? What if you end up with only 2 partitions after loading your data? Or perhaps you want to filter() your RDD instead?

Adrian

Tim Chou wrote
> Hi All,
>
> I use textFile to create an RDD. However, I don't want to handle the whole
> data in this RDD. For example, maybe I only want to process the data in the
> 3rd partition of the RDD.
>
> How can I do it? Here are some possible solutions I'm thinking of:
> 1. Create multiple RDDs when reading the file.
> 2. Run MapReduce functions on a specific partition of an RDD.
>
> However, I cannot find any appropriate function.
>
> Thank you; I look forward to your suggestions.
>
> Best,
> Tim

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-How-can-I-run-MapReduce-only-on-one-partition-in-an-RDD-tp18882p18884.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.