Merge pull request #201 from rxin/mappartitions

Use the proper partition index in mapPartitionsWithIndex
Before this change, mapPartitionsWithIndex used TaskContext.partitionId as the partition index. TaskContext.partitionId used to be identical to the partition index in an RDD. However, pull request #186 introduced a scenario (with partition pruning) in which the two can differ. This pull request uses the correct partition index in all mapPartitionsWithIndex-related calls.

It also removes the extra MapPartitionsWithContextRDD and puts all the mapPartitions-related functionality in MapPartitionsRDD.

Project: http://git-wip-us.apache.org/repos/asf/incubator-spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/incubator-spark/commit/14bb465b
Tree: http://git-wip-us.apache.org/repos/asf/incubator-spark/tree/14bb465b
Diff: http://git-wip-us.apache.org/repos/asf/incubator-spark/diff/14bb465b

Branch: refs/heads/master
Commit: 14bb465bb3d65f5b1034ada85cfcad7460034073
Parents: eb4296c e9ff13e
Author: Matei Zaharia <[email protected]>
Authored: Mon Nov 25 18:50:18 2013 -0800
Committer: Matei Zaharia <[email protected]>
Committed: Mon Nov 25 18:50:18 2013 -0800

----------------------------------------------------------------------
 .../org/apache/spark/rdd/MapPartitionsRDD.scala | 10 ++---
 .../spark/rdd/MapPartitionsWithContextRDD.scala | 41 --------------------
 .../main/scala/org/apache/spark/rdd/RDD.scala   | 39 +++++++++----------
 .../org/apache/spark/CheckpointSuite.scala      |  2 -
 4 files changed, 22 insertions(+), 70 deletions(-)
----------------------------------------------------------------------
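
Illustration (not part of the commit): a minimal, Spark-free sketch of how a task-level id and a partition index can diverge once partitions are pruned. The Partition and TaskContext case classes, the PartitionIndexSketch object, and the rule that tasks are renumbered from 0 after pruning are all simplified assumptions made for this example; they are not Spark's actual classes or scheduling logic.

```scala
// Hypothetical stand-ins for illustration only; these are NOT Spark's classes.
final case class Partition(index: Int)
final case class TaskContext(partitionId: Int)

object PartitionIndexSketch {
  // A parent dataset with four partitions, indexed 0 to 3.
  val parentPartitions: Vector[Partition] = Vector.tabulate(4)(Partition(_))

  // Assumed pruning rule: keep only parent partitions 2 and 3.
  val kept: Vector[Partition] = parentPartitions.filter(_.index >= 2)

  // Reads the index from the task context (the approach the message says was used before).
  def indexFromContext(ctx: TaskContext, split: Partition): Int = ctx.partitionId

  // Reads the index from the partition itself.
  def indexFromPartition(ctx: TaskContext, split: Partition): Int = split.index

  // Assumption: the tasks processing the kept partitions are numbered from 0.
  def run(f: (TaskContext, Partition) => Int): Vector[Int] =
    kept.zipWithIndex.map { case (split, taskId) => f(TaskContext(taskId), split) }

  def main(args: Array[String]): Unit = {
    println(run(indexFromContext))   // Vector(0, 1)
    println(run(indexFromPartition)) // Vector(2, 3)
  }
}
```

In this sketch the context-based lookup reports 0 and 1 for data that lives in parent partitions 2 and 3, which is the kind of mismatch the message describes.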
