Hi Xiao,
Thank you very much for the pointers. I looked into that part of the code, and
I now understand how the main method is invoked. It is still not clear to me
how the code is distributed to the executors. Is it the whole jar or some
serialized object? I was expecting to see the part of the code where the
closures are serialized and shipped. Maybe I am missing something.
Thanks again, Arijit
Date: Thu, 8 Oct 2015 10:26:55 -0700
Subject: Re: Understanding code/closure shipment to Spark workers
From: [email protected]
To: [email protected]
CC: [email protected]
Hi, Arijit,
The code flow of spark-submit is simple. Every step below is in
SparkSubmit.scala:
main()
--> case SparkSubmitAction.SUBMIT => submit(appArgs)
--> doRunMain(), defined inside submit()
--> runMain(childArgs,...)
--> mainMethod.invoke(null, childArgs.toArray)
The invoke() method is provided by Java reflection (java.lang.reflect.Method).
It is used here to call the main method of the main class in your JAR.
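
To make the last step of that flow concrete, here is a minimal sketch in plain Java of invoking a main method through reflection. It is not the SparkSubmit code: the names ReflectiveMainDemo, DemoApp and invokeMain are hypothetical and chosen only for illustration, and the sketch loads the class with the default class loader, whereas spark-submit would also have to make the submitted JAR visible on the classpath first.

```java
import java.lang.reflect.Method;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.List;

public class ReflectiveMainDemo {

    // Hypothetical stand-in for the application class packaged in the submitted JAR.
    public static class DemoApp {
        // Records the arguments that main() received, so the call can be checked.
        public static String[] lastArgs;

        public static void main(String[] args) {
            lastArgs = args;
            System.out.println("DemoApp.main called with " + Arrays.toString(args));
        }
    }

    // Looks up a class by name, finds its public static main(String[]), and calls it.
    public static void invokeMain(String mainClassName, List<String> childArgs) throws Exception {
        Class<?> mainClass = Class.forName(mainClassName);
        Method mainMethod = mainClass.getMethod("main", String[].class);
        if (!Modifier.isStatic(mainMethod.getModifiers())) {
            throw new IllegalStateException("The main method of the given class must be static");
        }
        String[] args = childArgs.toArray(new String[0]);
        // The receiver is null because main is static. The cast to Object makes Java
        // pass the array as one argument instead of spreading it as varargs.
        mainMethod.invoke(null, (Object) args);
    }

    public static void main(String[] args) throws Exception {
        invokeMain(DemoApp.class.getName(), Arrays.asList("--input", "data.txt"));
    }
}
```

Note that this only shows how the driver-side main method gets started. It does not show how closures are serialized or shipped to executors.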
Hopefully this helps you understand the flow.
Thanks,
Xiao Li
2015-10-07 16:47 GMT-07:00 Arijit <[email protected]>:
Hi,
I want to understand the code flow starting from the Spark jar that I submit
through spark-submit: how does Spark identify and extract the closures, clean
and serialize them, and ship them to workers to execute as tasks? Can someone
point me to any documentation, or to the relevant source code path, that would
help me understand this?
Thanks, Arijit