Re: configure spark history server for running on Yarn

Tom Graves Mon, 05 May 2014 07:11:45 -0700

Since 1.0 is still in development you can pick up the latest docs in git: 
https://github.com/apache/spark/tree/branch-1.0/docs


I didn't see anywhere that you said you started the spark history server?

there are multiple things that need to happen for the spark history server to 
work.

1) configure your application to save the history logs - see the eventLog 
settings here 
https://github.com/apache/spark/blob/branch-1.0/docs/configuration.md

2) On yarn -  know the host/port where you are going to start the spark history 
server and configure: spark.yarn.historyServer.address to point to it.  Note 
that this purely makes the link from the ResourceManager UI properly point to 
the Spark History Server Daemon.

3) Start the spark history server pointing to the same directory as specified 
in your application (spark.eventLog.dir)

4) run your application. once it finishes then you can either go to the RM UI 
to link to the spark history UI or go directly to the spark history server ui.

Tom
On Thursday, May 1, 2014 7:09 PM, Jenny Zhao <linlin200...@gmail.com> wrote:
 
Hi,

I have installed spark 1.0 from the branch-1.0, build went fine, and I have 
tried running the example on Yarn client mode, here is my command: 

/home/hadoop/spark-branch-1.0/bin/spark-submit 
/home/hadoop/spark-branch-1.0/examples/target/scala-2.10/spark-examples-1.0.0-hadoop2.2.0.jar
 --master yarn --deploy-mode client --executor-memory 6g --executor-cores 3 
--driver-memory 3g --name SparkPi --num-executors 2 --class 
org.apache.spark.examples.SparkPi yarn-client 5

after the run, I was not being able to retrieve the log from Yarn's web UI, 
while I have tried to specify the history server in spark-env.sh 

export SPARK_DAEMON_JAVA_OPTS="-Dspark.yarn.historyServer.address=master:18080"


I also tried to specify it in spark-defaults.conf, doesn't work as well, I 
would appreciate if someone can tell me what is the way of specifying it either 
in spark-env.sh or spark-defaults.conf, so that this option can be applied to 
any spark application. 


another thing I found is the usage output for spark-submit is not complete/not 
in sync with the online documentation, hope it is addressed with the formal 
release. 

and is this the latest documentation for spark 1.0? 
http://people.csail.mit.edu/matei/spark-unified-docs/running-on-yarn.html

Thank you!

Re: configure spark history server for running on Yarn

Reply via email to