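No, it won't. The return type of RDD#foreach is Unit, so it doesn't return an RDD. foreach is useful purely for its side effects, not for its return value, and modifying an RDD's elements in place via foreach is generally not a good idea: the change exists only in whatever copies of the objects happen to be cached, so it is lost if a partition is evicted or recomputed from its lineage. Use a transformation such as map to build a new RDD instead.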
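To make the difference concrete, here is a minimal sketch of the map-based alternative. It assumes Spark is on the classpath and runs in local mode; the immutable rewrite of AppDataPoint, the withY helper, the UpdatePoints object and the sample values are all made up for illustration and are not taken from the original code.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// Immutable variant of the data point from the question: no `var` fields.
// (Assumption: the real class has more members, elided in the original post.)
case class AppDataPoint(y: Double, x: Array[Double])

object UpdatePoints {
  // Hypothetical helper: builds a new point with y replaced, leaving p untouched.
  def withY(p: AppDataPoint, newY: Double): AppDataPoint = p.copy(y = newY)

  def main(args: Array[String]): Unit = {
    // Local master chosen only so the sketch runs without a cluster.
    val conf = new SparkConf().setAppName("UpdatePoints").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // Made-up sample data standing in for lines.map(parsePoint _).
      val points = sc.parallelize(Seq(
        AppDataPoint(1.0, Array(0.1, 0.2)),
        AppDataPoint(-1.0, Array(0.3, 0.4))
      )).cache()

      val anotherValue = 0.0

      // foreach is an action whose result is Unit, so there is nothing to assign:
      // val r: Unit = points.foreach(p => println(p.y))

      // map is a transformation: it returns a new RDD and leaves `points` as it was.
      val updated = points.map(p => withY(p, anotherValue)).cache()

      println("updated y values:  " + updated.map(_.y).collect().mkString(", "))
      println("original y values: " + points.map(_.y).collect().mkString(", "))
    } finally {
      sc.stop()
    }
  }
}
```

Because updated is derived from points by a deterministic function, Spark can rebuild any lost partition from the lineage and get the same values back, which is not guaranteed for changes made inside foreach.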
On Mon, Mar 24, 2014 at 6:35 PM, hequn cheng <chenghe...@gmail.com> wrote:

> points.foreach(p=>p.y = another_value) will return a new modified RDD.
>
>
> 2014-03-24 18:13 GMT+08:00 Chieh-Yen <r01944...@csie.ntu.edu.tw>:
>
>> Dear all,
>>
>> I have a question about the usage of RDD.
>> I implemented a class called AppDataPoint, it looks like:
>>
>> case class AppDataPoint(input_y : Double, input_x : Array[Double])
>>     extends Serializable {
>>   var y : Double = input_y
>>   var x : Array[Double] = input_x
>>   ......
>> }
>> Furthermore, I created the RDD by the following function.
>>
>> def parsePoint(line: String): AppDataPoint = {
>>   /* Some related works for parsing */
>>   ......
>> }
>>
>> Assume the RDD called "points":
>>
>> val lines = sc.textFile(inputPath, numPartition)
>> var points = lines.map(parsePoint _).cache()
>>
>> The question is that, I tried to modify the value of this RDD, the
>> operation is:
>>
>> points.foreach(p=>p.y = another_value)
>>
>> The operation is workable.
>> There doesn't have any warning or error message showed by the system and
>> the results are right.
>> I wonder that if the modification for RDD is a correct and in fact
>> workable design.
>> The usage web said that the RDD is immutable, is there any suggestion?
>>
>> Thanks a lot.
>>
>> Chieh-Yen Lin