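Hi Phil,

A for loop without yield is only syntax: the Scala compiler rewrites for (x <- rdd) { println(x) } to rdd.foreach(x => println(x)), so both forms run in parallel on the executors rather than on the driver. You can try both without collecting the RDD first. On a cluster, println in either form writes to the executors' stdout and you would not see anything on the driver console; in local mode the executors share the driver's JVM, so the output does show up there.

Thanks
Deepak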
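P.S. To see the rewrite in isolation, here is a minimal sketch that needs no Spark at all. MiniRdd is a hypothetical stand-in class I made up for illustration, not a Spark type; the only thing it shares with a real RDD is that it defines foreach. That alone is enough for the compiler to accept a for loop over it, because a for loop without yield is translated into a foreach call.

```scala
// Hypothetical stand-in for an RDD: it defines foreach and nothing else.
// It is NOT a Spark class; it only illustrates the compiler rewrite.
final class MiniRdd[A](items: Seq[A]) {
  def foreach(f: A => Unit): Unit = items.foreach(f)
}

object ForVsForeach {
  // Collect elements using for-loop syntax.
  def viaFor[A](rdd: MiniRdd[A]): List[A] = {
    val out = List.newBuilder[A]
    for (x <- rdd) { out += x }      // rewritten by the compiler to rdd.foreach(...)
    out.result()
  }

  // Collect elements by calling foreach explicitly.
  def viaForeach[A](rdd: MiniRdd[A]): List[A] = {
    val out = List.newBuilder[A]
    rdd.foreach(x => out += x)
    out.result()
  }

  def main(args: Array[String]): Unit = {
    val rdd = new MiniRdd(Seq(1, 2, 3))
    // Both forms visit the same elements in the same order.
    println(viaFor(rdd) == viaForeach(rdd))
  }
}
```

With a real RDD the function passed to foreach is shipped to the executors, so mutating a driver-side builder like this would not work there; the sketch only shows that the two spellings are the same call.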
On Tue, Jul 12, 2016 at 9:28 PM, philipghu <philguang...@gmail.com> wrote:
> Hi,
>
> I'm new to Spark and Scala as well. I understand that we can use foreach to
> apply a function to each element of an RDD, like rdd.foreach(x => println(x)),
> but I saw we can also do a for loop to print each element of an RDD, like
>
> for (x <- rdd) {
>   println(x)
> }
>
> Does defining the foreach function in RDD make an RDD traversable like
> this?
> Does the compiler automatically invoke the foreach function when it sees a
> for loop?
>
> Thanks!
> Phil
>
> --
> View this message in context:
> http://apache-spark-user-list.1001560.n3.nabble.com/RDD-for-loop-vs-foreach-tp27326.html

--
Thanks
Deepak
www.bigdatabig.com
www.keosha.net