No. The index passed to mapWith (and to mapPartitionsWithIndex and
several other RDD methods) is the index of the RDD partition, not of the
individual line.  For example, when an RDD is read from a distributed
filesystem and the input file occupies n blocks, the index values seen in
mapWith range from 0 to n-1, because by default one RDD partition is
created for each file block.  Every line in a given partition sees the
same index, so on its own it cannot serve as a per-line key for a join.
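
To make the distinction concrete, here is a rough sketch rather than a
definitive recipe. It assumes a local SparkContext and uses a small
parallelized collection as a stand-in for a file split into three blocks;
the object name, app name, and sample data are made up for illustration.
The first part shows what the index actually is. The second part is one
possible way to derive a global line number from it, assuming the
partitions are in the same order as the lines of the file.

```scala
import org.apache.spark.SparkContext

object PartitionIndexDemo {
  def main(args: Array[String]): Unit = {
    // Local mode with two threads; master and app name are illustrative.
    val sc = new SparkContext("local[2]", "partition-index-demo")
    try {
      // Stand-in for a file occupying 3 blocks: six "lines", 3 partitions.
      val lines = sc.parallelize(Seq("a", "b", "c", "d", "e", "f"), 3)

      // idx is the partition index (0 to 2 here), shared by every
      // element of that partition. It is not a line number.
      val tagged = lines.mapPartitionsWithIndex { (idx, iter) =>
        iter.map(line => (idx, line))
      }
      tagged.collect().foreach(println)

      // Pass 1: count the elements in each partition.
      val counts = lines
        .mapPartitionsWithIndex { (idx, iter) => Iterator((idx, iter.size.toLong)) }
        .collect()
        .sortBy(_._1)
        .map(_._2)

      // offsets(i) = number of elements in all partitions before i.
      val offsets = counts.scanLeft(0L)(_ + _)

      // Pass 2: global index = partition offset + position in partition.
      val numbered = lines.mapPartitionsWithIndex { (idx, iter) =>
        iter.zipWithIndex.map { case (line, i) => (offsets(idx) + i, line) }
      }
      numbered.collect().foreach(println)
    } finally {
      sc.stop()
    }
  }
}
```

The second RDD holds (line number, line) pairs, which is the key/value
shape a join expects.  Note that this costs an extra pass over the data
to count each partition.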


On Tue, Dec 24, 2013 at 7:14 AM, Aureliano Buendia <[email protected]> wrote:

> Hi,
>
> Given a distributed file, does mapWith provide the functionality to know
> the index of each line (line number -1) across all worker nodes?
>
> Can mapWith be used to treat index as a key when joining two RDD?
>
