My guess is you need to use map for this. foreach is meant for side effects,
and mutating the elements themselves is not an expected use: the closure runs
on deserialized copies of your objects on the workers, so the mutations never
make it back to the RDD you collect on the driver. Also, RDD elements are
supposed to be immutable, and yours aren't.
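A minimal sketch of the map-based approach, assuming this runs in spark-shell
where a SparkContext `sc` is already in scope (the immutable case class and
`copy` call are illustrative, not from your code):

```scala
// Keep the case class immutable (val fields, no setter).
case class A(a: Int)

val as = sc.parallelize(List(A(1), A(2)))

// map builds a NEW RDD of updated elements instead of mutating in place;
// copy() returns a fresh A with the field replaced.
val updated = as.map(x => x.copy(a = 100))

updated.collect()  // Array(A(100), A(100))
```

RDDs themselves are immutable, so "updating" always means deriving a new RDD
and using that from then on.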


On Tue, Nov 5, 2013 at 4:40 PM, Hao REN <[email protected]> wrote:

> Hi,
>
> Just a quick question:
>
> When playing Spark with my toy code as below, I get some unexpected
> results.
>
>
> *case class A(var a: Int) {*
> *    def setA() = { a = 100 }*
> *}*
>
> *val as = sc.parallelize(List(A(1), A(2)))   // it is an RDD[A]*
>
>
> *as.foreach(_.setA())*
>
> *as.collect  // it gives Array[this.A] = Array(A(1), A(2))*
>
>
> The result expected is Array(A(100), A(100)). I am just trying to update
> the content of the objects of A which reside in RDD.
>
> 1) Does the foreach do the right thing ?
> 2) Which is the best way to update the object in RDD, use 'map' instead ?
>
> Thank you.
>
> Hao
>
> --
> REN Hao
>
> Data Engineer @ ClaraVista
>
> Paris, France
>
> Tel:  +33 06 14 54 57 24
>
