Hi

Regarding your question

1) When I run the above script, which jar is being submitted to the YARN 
server?

Both jars are submitted to the YARN server: the one the SPARK_JAR environment 
variable points to and the one passed with --jar.

2) It looks like spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar plays 
the role of the client side, and spark-examples-assembly-0.8.1-incubating.jar 
carries the Spark runtime and examples that will run in YARN. Am I right?

spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar also goes to the YARN 
cluster, where it serves as the runtime for the application jar 
(spark-examples-assembly-0.8.1-incubating.jar).

3) Does anyone have similar experience? I have done a lot of Hadoop MR work 
and want to follow the same logic to submit a Spark job. For now I can only 
find the command-line way to submit a Spark job to YARN. I believe there is an 
easy way to integrate Spark into a web application.

You can use yarn-client mode. Take a look at docs/running-on-yarn.md, 
preferably on the master branch, which has our latest updates to that part of 
the docs. In yarn-client mode, the SparkContext itself does much the same 
thing as the command line does to submit a YARN job.

To use it from Java, try JavaSparkContext instead of SparkContext. I haven't 
personally run it with complicated applications, but a small example app did 
work.
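
If yarn-client mode is not available in the release you are on, a stopgap is 
to launch the same command line from Java as a child process. The following is 
only a rough sketch using the JDK's ProcessBuilder, not anything from the Spark 
docs: the class name YarnSubmitSketch, the helper names, and the way the Spark 
home directory is passed in are all made up for illustration, and the flags 
are copied verbatim from the script in the original message.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: runs the spark-class command from the original
// script as a child process. Nothing here is a Spark API.
class YarnSubmitSketch {

    // Builds the argument list used by the shell script, one token per element.
    static List<String> buildCommand(String appJar, String mainClass, int numWorkers) {
        List<String> cmd = new ArrayList<String>();
        cmd.add("./spark-class");
        cmd.add("org.apache.spark.deploy.yarn.Client");
        cmd.add("--jar");
        cmd.add(appJar);
        cmd.add("--class");
        cmd.add(mainClass);
        cmd.add("--args");
        cmd.add("yarn-standalone");
        cmd.add("--num-workers");
        cmd.add(String.valueOf(numWorkers));
        cmd.add("--master-memory");
        cmd.add("1g");
        cmd.add("--worker-memory");
        cmd.add("512m");
        cmd.add("--worker-cores");
        cmd.add("1");
        return cmd;
    }

    // Sets the working directory and the two environment variables that the
    // script exports before calling spark-class.
    static ProcessBuilder prepare(File sparkHome, String yarnConfDir,
                                  String sparkJar, List<String> cmd) {
        ProcessBuilder pb = new ProcessBuilder(cmd);
        pb.directory(sparkHome); // the script uses paths relative to Spark home
        pb.environment().put("YARN_CONF_DIR", yarnConfDir);
        pb.environment().put("SPARK_JAR", sparkJar);
        pb.inheritIO();          // forward the child's output to this JVM's console
        return pb;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 1) {
            System.err.println("usage: YarnSubmitSketch <spark-home-dir>");
            return;
        }
        List<String> cmd = buildCommand(
            "./examples/target/scala-2.9.3/spark-examples-assembly-0.8.1-incubating.jar",
            "org.apache.spark.examples.SparkPi",
            3);
        ProcessBuilder pb = prepare(
            new File(args[0]),
            "/home/gpadmin/clusterConfDir/yarn",
            "./assembly/target/scala-2.9.3/spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar",
            cmd);
        int exitCode = pb.start().waitFor(); // blocks until the client exits
        System.out.println("spark-class exited with code " + exitCode);
    }
}
```

A web application would call buildCommand and prepare from a background 
thread rather than from main, since waitFor blocks until the YARN client 
returns.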
        

Best Regards,
Raymond Liu

-----Original Message-----
From: John Zhao [mailto:jz...@alpinenow.com] 
Sent: Thursday, January 16, 2014 2:25 AM
To: user@spark.incubator.apache.org
Subject: Anyone know hot to submit spark job to yarn in java code?

I am working on a web application and want to submit a Spark job to Hadoop 
YARN.
I have already built my own assembly and can run it from the command line with 
the following script:

export YARN_CONF_DIR=/home/gpadmin/clusterConfDir/yarn
export SPARK_JAR=./assembly/target/scala-2.9.3/spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar
./spark-class org.apache.spark.deploy.yarn.Client \
  --jar ./examples/target/scala-2.9.3/spark-examples-assembly-0.8.1-incubating.jar \
  --class org.apache.spark.examples.SparkPi \
  --args yarn-standalone \
  --num-workers 3 \
  --master-memory 1g \
  --worker-memory 512m \
  --worker-cores 1

It works fine.
Then I realized that it is hard to submit the job from a web application. It 
looks like spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar and 
spark-examples-assembly-0.8.1-incubating.jar are each a really big jar; I 
believe each one contains everything.
So my questions are:
1) When I run the above script, which jar is being submitted to the YARN 
server?
2) It looks like spark-assembly-0.8.1-incubating-hadoop2.0.5-alpha.jar plays 
the role of the client side, and spark-examples-assembly-0.8.1-incubating.jar 
carries the Spark runtime and examples that will run in YARN. Am I right?
3) Does anyone have similar experience? I have done a lot of Hadoop MR work 
and want to follow the same logic to submit a Spark job. For now I can only 
find the command-line way to submit a Spark job to YARN. I believe there is an 
easy way to integrate Spark into a web application.


Thanks.
John.
