You need to look at the log files for YARN.  Generally this can be done with 
"yarn logs -applicationId <your_app_id>".  That only works if you have log 
aggregation enabled though.   You should be able to see at least the application 
master logs through the YARN ResourceManager web UI.  I would try that first. 

If that doesn't work, you can turn on debugging on the NodeManager side:

To review the per-container launch environment, increase 
yarn.nodemanager.delete.debug-delay-sec to a large value (e.g., 36000), and then 
access the application cache through yarn.nodemanager.local-dirs on the nodes 
on which containers are launched. This directory contains the launch script, 
jars, and all environment variables used for launching each container. This 
process is useful for debugging classpath problems in particular. (Note that 
enabling this requires admin privileges on cluster settings and a restart of 
all node managers. Thus, this is not applicable to hosted clusters).
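
A minimal sketch of what that looks like, assuming the stock yarn-site.xml on 
each NodeManager (the property name is the one above; the directory layout under 
the local dirs may differ slightly between Hadoop versions):

    <property>
      <name>yarn.nodemanager.delete.debug-delay-sec</name>
      <value>36000</value>
    </property>

After restarting the node managers and re-running the job, look under one of the 
yarn.nodemanager.local-dirs on the node that ran the container, e.g.

    <local-dir>/usercache/<user>/appcache/<application_id>/<container_id>/launch_container.sh

which contains the exact classpath and environment variables the executor was 
launched with.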



Tom


On Monday, May 12, 2014 9:38 AM, Sai Prasanna <ansaiprasa...@gmail.com> wrote:
 
Hi All, 

I wanted to launch Spark on YARN interactively, in yarn-client mode.

With the default settings of yarn-site.xml and spark-env.sh, I followed the 
instructions at 
http://spark.apache.org/docs/0.8.1/running-on-yarn.html

I get the correct Pi value when I run the example without launching the shell.

When I launch the shell with the following command,

SPARK_JAR=./assembly/target/scala-2.9.3/spark-assembly-0.8.1-incubating-hadoop2.3.0.jar \
SPARK_YARN_APP_JAR=examples/target/scala-2.9.3/spark-examples-assembly-0.8.1-incubating.jar \
MASTER=yarn-client ./spark-shell
and then try to create RDDs and run an action on them, I get nothing back. After 
some time the tasks fail.
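
(Roughly the kind of thing I try in the shell -- the path is just a placeholder, 
not my exact code; the top() call is the "top at <console>:15" that shows up in 
the log below:)

    val data = sc.textFile("hdfs:///path/to/input")   // placeholder path, not the real input
    data.top(10)                                      // this action never returns; the tasks eventually fail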

Log file of Spark: 
14/05/12 13:30:40 INFO YarnClientClusterScheduler: YarnClientClusterScheduler.postStartHook done
14/05/12 13:30:40 INFO BlockManagerMasterActor$BlockManagerInfo: Registering block manager s1:38355 with 324.4 MB RAM
14/05/12 13:31:38 INFO MemoryStore: ensureFreeSpace(202584) called with curMem=0, maxMem=340147568
14/05/12 13:31:38 INFO MemoryStore: Block broadcast_0 stored as values to memory (estimated size 197.8 KB, free 324.2 MB)
14/05/12 13:31:49 INFO FileInputFormat: Total input paths to process : 1
14/05/12 13:31:49 INFO NetworkTopology: Adding a new node: /default-rack/192.168.1.100:50010
14/05/12 13:31:49 INFO SparkContext: Starting job: top at <console>:15
14/05/12 13:31:49 INFO DAGScheduler: Got job 0 (top at <console>:15) with 4 output partitions (allowLocal=false)
14/05/12 13:31:49 INFO DAGScheduler: Final stage: Stage 0 (top at <console>:15)
14/05/12 13:31:49 INFO DAGScheduler: Parents of final stage: List()
14/05/12 13:31:49 INFO DAGScheduler: Missing parents: List()
14/05/12 13:31:49 INFO DAGScheduler: Submitting Stage 0 (MapPartitionsRDD[2] at top at <console>:15), which has no missing parents
14/05/12 13:31:49 INFO DAGScheduler: Submitting 4 missing tasks from Stage 0 (MapPartitionsRDD[2] at top at <console>:15)
14/05/12 13:31:49 INFO YarnClientClusterScheduler: Adding task set 0.0 with 4 tasks
14/05/12 13:31:49 INFO RackResolver: Resolved s1 to /default-rack
14/05/12 13:31:49 INFO ClusterTaskSetManager: Starting task 0.0:3 as TID 0 on executor 1: s1 (PROCESS_LOCAL)
14/05/12 13:31:49 INFO ClusterTaskSetManager: Serialized task 0.0:3 as 1811 bytes in 4 ms
14/05/12 13:31:49 INFO ClusterTaskSetManager: Starting task 0.0:0 as TID 1 on executor 1: s1 (NODE_LOCAL)
14/05/12 13:31:49 INFO ClusterTaskSetManager: Serialized task 0.0:0 as 1811 bytes in 1 ms
14/05/12 13:32:18 INFO YarnClientSchedulerBackend: Executor 1 disconnected, so removing it
14/05/12 13:32:18 ERROR YarnClientClusterScheduler: Lost executor 1 on s1: remote Akka client shutdown
14/05/12 13:32:18 INFO ClusterTaskSetManager: Re-queueing tasks for 1 from TaskSet 0.0
14/05/12 13:32:18 WARN ClusterTaskSetManager: Lost TID 1 (task 0.0:0)
14/05/12 13:32:18 WARN ClusterTaskSetManager: Lost TID 0 (task 0.0:3)
14/05/12 13:32:18 INFO DAGScheduler: Executor lost: 1 (epoch 0)
14/05/12 13:32:18 INFO BlockManagerMasterActor: Trying to remove executor 1 from BlockManagerMaster.
14/05/12 13:32:18 INFO BlockManagerMaster: Removed 1 successfully in removeExecutor


Do I need to set any other environment variables specifically for Spark on YARN? 
What could be the issue?


Can anyone please help me in this regard.

Thanks in Advance !!
