Hi,
Just a quick question:
While playing with Spark using the toy code below, I get an unexpected result.
    case class A(var a: Int) {
      def setA() = { a = 100 }
    }
    val as = sc.parallelize(List(A(1), A(2))) // this is an RDD[A]
    as.foreach(_.setA())
    as.collect // gives Array[this.A] = Array(A(1), A(2))
The result I expected is Array(A(100), A(100)). I am just trying to update
the contents of the A objects that reside in the RDD.
1) Does foreach do the right thing here?
2) What is the best way to update the objects in an RDD? Should I use 'map' instead?
Thank you.
Hao
--
REN Hao
Data Engineer @ ClaraVista
Paris, France
Tel: +33 06 14 54 57 24