Statements are executed only when you try to cause some effect on the
server (produce data, collect data on the driver). At execution time, Spark
does all the dependency resolution, truncates paths that don't go anywhere,
and optimizes the execution pipelines. So you really don't have to worry
about
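
To make the deferred-execution point concrete, here is a toy sketch in plain Python. It is not Spark code: LazyDataset, double and the calls list are made-up names for illustration only, and the real scheduler does far more (dependency resolution, pruning, pipelining). It only shows the basic idea that a transformation is recorded when you call it and runs when an action asks for data.

```python
# Toy model of lazy evaluation (NOT Spark itself): transformations are
# only recorded; work happens when an action such as collect() is called.
class LazyDataset:
    def __init__(self, data, steps=()):
        self._data = list(data)
        self._steps = tuple(steps)  # recorded transformations, not yet run

    def map(self, fn):
        # Returns immediately: just appends the step to the plan.
        return LazyDataset(self._data, self._steps + (fn,))

    def collect(self):
        # Action: runs every recorded step now and returns the results.
        out = self._data
        for fn in self._steps:
            out = [fn(v) for v in out]
        return out


calls = []

def double(v):
    calls.append(v)  # side effect so we can see when the work really runs
    return v * 2

ds = LazyDataset([1, 2, 3]).map(double)
ran_before_action = len(calls)  # the map step has not executed yet
result = ds.collect()           # execution happens here
ran_after_action = len(calls)
```

In this sketch the call to map() returns without touching the data, and double() only runs once collect() is invoked, which is the same shape as the behaviour described above.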
Hello friends:
I have a theory question about call blocking in a Spark driver.
Consider this (admittedly contrived =:)) snippet to illustrate this question...
x = rdd01.reduceByKey() # or maybe some other 'shuffle-requiring action'.
b = sc.broadcast(x.take(20))  # Or any statement that