How about using mapToPair and swapping the two elements of each pair?
Below is the code. Will it be efficient to convert like this?


JavaPairRDD<Long, MatcherReleventData> RddForMarch =
    matchRdd.zipWithIndex().mapToPair(
        new PairFunction<Tuple2<VendorRecord, Long>, Long, MatcherReleventData>() {

          @Override
          public Tuple2<Long, MatcherReleventData> call(Tuple2<VendorRecord, Long> t)
              throws Exception {
            // Swap (record, index) into (index, converted record).
            MatcherReleventData matcherData = new MatcherReleventData();
            return new Tuple2<Long, MatcherReleventData>(
                t._2(), matcherData.convertVendorDataToMatcherData(t._1()));
          }

        }).cache();
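
For reference, here is a minimal Spark-free sketch of what the
zipWithIndex-then-swap step does to the data. The class and method names
(IndexSwapSketch, zipWithIndex, swapToOneBased) are hypothetical and only
model the behavior on a local List; they are not Spark APIs. Spark's
zipWithIndex numbers elements from 0, so the sketch adds 1 to get the
1-based index asked for in the original question.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

public class IndexSwapSketch {

    // Local stand-in for zipWithIndex: pairs each element with its 0-based position.
    public static <T> List<Map.Entry<T, Long>> zipWithIndex(List<T> items) {
        List<Map.Entry<T, Long>> zipped = new ArrayList<>();
        long index = 0L;
        for (T item : items) {
            zipped.add(new SimpleEntry<T, Long>(item, Long.valueOf(index)));
            index++;
        }
        return zipped;
    }

    // Local stand-in for the mapToPair step: turns (element, index) into (index + 1, element).
    public static <T> List<Map.Entry<Long, T>> swapToOneBased(List<Map.Entry<T, Long>> zipped) {
        List<Map.Entry<Long, T>> swapped = new ArrayList<>();
        for (Map.Entry<T, Long> pair : zipped) {
            Long oneBased = Long.valueOf(pair.getValue().longValue() + 1L);
            swapped.add(new SimpleEntry<Long, T>(oneBased, pair.getKey()));
        }
        return swapped;
    }

    public static void main(String[] args) {
        List<String> records = Arrays.asList("first", "second", "third");
        // Each element keeps its original position; only the pair layout changes.
        for (Map.Entry<Long, String> e : swapToOneBased(zipWithIndex(records))) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```

The swap itself is a per-element operation, so in Spark it should not need
a shuffle. The more expensive part is likely zipWithIndex, which has to run
a job to count the elements in each partition when the RDD has more than
one partition.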

On 13 April 2015 at 03:11, Ted Yu <yuzhih...@gmail.com> wrote:

> Please also take a look at ZippedWithIndexRDDPartition which is 72 lines
> long.
>
> You can create your own version which extends RDD[(Long, T)]
>
> Cheers
>
> On Sun, Apr 12, 2015 at 1:29 PM, Ted Yu <yuzhih...@gmail.com> wrote:
>
>> bq. will return something like JavaPairRDD<Object, long>
>>
>> The long component of the pair fits your description of an index. What other
>> requirement does zipWithIndex not meet?
>>
>> Cheers
>>
>> On Sun, Apr 12, 2015 at 1:16 PM, Jeetendra Gangele <gangele...@gmail.com>
>> wrote:
>>
>>> Hi all, I have a JavaRDD<Object> and I want to convert it to a
>>> JavaPairRDD<Index, Object>. The index should be unique and should preserve
>>> the order: the first object should get 1, the second 2, and so on.
>>>
>>> I tried using zipWithIndex, but it returns something like
>>> JavaPairRDD<Object, Long>.
>>> I want to use this RDD for lookup and join operations later in my
>>> workflow, so ordering is important.
>>>
>>>
>>> Regards
>>> jeet
>>>
>>
>>
>
