Hi,
Regarding your questions:
1) When I run the above script, which jar is being submitted to the YARN
server?
Both the jar that the SPARK_JAR environment variable points to and the jar
passed with --jar are submitted to the YARN cluster.
2) It looks like spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar plays
the role of the client side, and spark-examples-assembly-0.8.1-incubating.jar
carries the Spark runtime and examples that will run in YARN. Am I right?
The spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar is also shipped to
the YARN cluster, where it serves as the runtime for the app jar
(spark-examples-assembly-0.8.1-incubating.jar).
3) Does anyone have similar experience? I have done a lot of Hadoop MR work
and want to follow the same logic to submit a Spark job. So far I can only find
the command-line way to submit a Spark job to YARN. I believe there is an easy
way to integrate Spark into a web application.
You can use yarn-client mode. Take a look at docs/running-on-yarn.md,
preferably on the master branch, which has our latest updates to that part of
the docs. In yarn-client mode, the SparkContext itself does much the same thing
as the command line does to submit a YARN job.
To use it from Java, try JavaSparkContext instead of SparkContext. I have not
personally run it with complicated applications, but a small example app did
work.
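As a rough illustration of what submitting from Java code could look like
without yarn-client mode, the sketch below rebuilds the command line from the
original message with java.lang.ProcessBuilder. The class name
YarnSubmitSketch, the method buildSubmission, and the paths used in main are
hypothetical and not part of Spark; the flags are copied from the quoted
script. It only prints the command and does not start the process.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not a Spark API: it mirrors the quoted shell script
// so that a Java process (for example a web application) can launch it.
class YarnSubmitSketch {

    static ProcessBuilder buildSubmission(String sparkHome, String yarnConfDir,
                                          String sparkJar, String appJar,
                                          String mainClass, int numWorkers) {
        // Same arguments as the command-line submission in the original message.
        List<String> cmd = new ArrayList<>(Arrays.asList(
                new File(sparkHome, "spark-class").getPath(),
                "org.apache.spark.deploy.yarn.Client",
                "--jar", appJar,
                "--class", mainClass,
                "--args", "yarn-standalone",
                "--num-workers", String.valueOf(numWorkers),
                "--master-memory", "1g",
                "--worker-memory", "512m",
                "--worker-cores", "1"));

        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.directory(new File(sparkHome));

        // Equivalent of the two "export" lines in the script.
        Map<String, String> env = pb.environment();
        env.put("YARN_CONF_DIR", yarnConfDir);
        env.put("SPARK_JAR", sparkJar);

        // Forward the child's stdout/stderr to this process.
        pb.inheritIO();
        return pb;
    }

    public static void main(String[] args) {
        // All paths below are placeholders (assumptions), not real locations.
        ProcessBuilder pb = buildSubmission(
                "/opt/spark",
                "/home/gpadmin/clusterConfDir/yarn",
                "/opt/spark/spark-assembly.jar",
                "/opt/spark/spark-examples-assembly.jar",
                "org.apache.spark.examples.SparkPi",
                3);
        System.out.println(String.join(" ", pb.command()));
        // To actually submit: int exit = pb.start().waitFor();
    }
}
```

Both jars show up here just as in the script: the assembly jar through the
SPARK_JAR environment variable and the app jar through --jar.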
Best Regards,
Raymond Liu
-----Original Message-----
From: John Zhao [mailto:[email protected]]
Sent: Thursday, January 16, 2014 2:25 AM
To: [email protected]
Subject: Anyone know how to submit spark job to yarn in java code?
I am working on a web application and I want to submit a Spark job to Hadoop
YARN.
I have already built my own assembly and can run it from the command line with
the following script:
export YARN_CONF_DIR=/home/gpadmin/clusterConfDir/yarn
export SPARK_JAR=./assembly/target/scala-2.9.3/spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar
./spark-class org.apache.spark.deploy.yarn.Client \
  --jar ./examples/target/scala-2.9.3/spark-examples-assembly-0.8.1-incubating.jar \
  --class org.apache.spark.examples.SparkPi \
  --args yarn-standalone \
  --num-workers 3 \
  --master-memory 1g \
  --worker-memory 512m \
  --worker-cores 1
It works fine.
Then I realized that it is hard to submit the job from a web application. It
looks like spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar or
spark-examples-assembly-0.8.1-incubating.jar is a really big jar; I believe it
contains everything.
So my questions are:
1) When I run the above script, which jar is being submitted to the YARN
server?
2) It looks like spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar plays
the role of the client side, and spark-examples-assembly-0.8.1-incubating.jar
carries the Spark runtime and examples that will run in YARN. Am I right?
3) Does anyone have similar experience? I have done a lot of Hadoop MR work
and want to follow the same logic to submit a Spark job. So far I can only find
the command-line way to submit a Spark job to YARN. I believe there is an easy
way to integrate Spark into a web application.
Thanks.
John.