Say I have a main method with the following pseudo-code (to be run on a Spark
standalone cluster):
def main(args: Array[String]): Unit = {
  val rdd: RDD[...] = ...        // some source RDD
  val rdd1 = rdd.map(...)        // first transformation on the RDD
  // some other statements not using RDDs
  val rdd2 = rdd.filter(...)     // second transformation on the same RDD
}

When executed, will each of the two statements involving RDDs (the map and
the filter) be individually partitioned and distributed across the available
cluster nodes? And will any statements not involving RDDs (or DataFrames)
typically be executed on the driver?
Is that how Spark takes advantage of the cluster?
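
For concreteness, here is a runnable version of the pseudo-code above; the
parallelize source, the lambdas, and the count() actions are placeholders I
made up just so it compiles, and the comments reflect my current understanding
of where each piece would run:

import org.apache.spark.sql.SparkSession

object RddQuestion {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("rdd-question").getOrCreate()
    val sc = spark.sparkContext

    val rdd  = sc.parallelize(1 to 1000000)  // source data, split into partitions
    val rdd1 = rdd.map(_ * 2)                // transformation, runs on executors?
    val x    = args.length + 1               // plain statement, runs on the driver?
    val rdd2 = rdd.filter(_ % 2 == 0)        // another transformation on the same RDD

    println(rdd1.count())  // actions like count() trigger the distributed jobs
    println(rdd2.count())
    spark.stop()
  }
}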


