Have you looked at this method ?

   * Zips this RDD with its element indices. The ordering is first based on
the partition index
...
  def zipWithIndex(): RDD[(T, Long)] = withScope {

On Wed, Jan 27, 2016 at 6:03 PM, Krishna <research...@gmail.com> wrote:

> Hi,
>
> I've a scenario where I need to maintain state that is local to a worker
> that can change during map operation. What's the best way to handle this?
>
> *incr = 0*
> *def row_index():*
> *  global incr*
> *  incr += 1*
> *  return incr*
>
> *out_rdd = inp_rdd.map(lambda x: row_index()).collect()*
>
> "out_rdd" in this case only contains 1s but I would like it to have index
> of each row in "inp_rdd".
>

Reply via email to