Have you looked at this method?

   * Zips this RDD with its element indices. The ordering is first based on the partition index
   ...
  def zipWithIndex(): RDD[(T, Long)] = withScope {

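In case it helps to see why this works without any shared counter, here is a rough pure-Python sketch of the idea. It is an illustration only, not Spark's code: the helper name zip_with_index and the sample partitions are made up, and plain lists stand in for RDD partitions. The assumption is that indices are assigned partition by partition, so each partition only needs to know how many elements come before it.

```python
from itertools import accumulate


def zip_with_index(partitions):
    """Pair every element with a global index, ordered by partition, then position."""
    # Pass 1: count the elements in each partition.
    counts = [len(part) for part in partitions]
    # Starting offset of each partition = total count of all earlier partitions.
    offsets = [0] + list(accumulate(counts))[:-1]
    # Pass 2: each partition indexes its own elements using only its offset,
    # so no mutable state has to be shared between workers.
    return [
        [(item, offset + i) for i, item in enumerate(part)]
        for part, offset in zip(partitions, offsets)
    ]


# Hypothetical data standing in for inp_rdd split across three partitions.
partitions = [["a", "b"], ["c"], ["d", "e", "f"]]
indexed = zip_with_index(partitions)

# Flatten to just the row indices, like the map in the original question.
row_indices = [idx for part in indexed for _, idx in part]
print(row_indices)
```

In PySpark itself the equivalent call should be something like inp_rdd.zipWithIndex().map(lambda pair: pair[1]).collect(). Note that the indices start at 0, so add 1 if you want the same numbering as row_index().
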
On Wed, Jan 27, 2016 at 6:03 PM, Krishna <research...@gmail.com> wrote:

> Hi,
>
> I've a scenario where I need to maintain state that is local to a worker
> that can change during map operation. What's the best way to handle this?
>
> incr = 0
> def row_index():
>     global incr
>     incr += 1
>     return incr
>
> out_rdd = inp_rdd.map(lambda x: row_index()).collect()
>
> "out_rdd" in this case only contains 1s but I would like it to have index
> of each row in "inp_rdd".
>