Hi All,

I wanted to launch Spark on YARN in interactive yarn-client mode.

With the default settings in yarn-site.xml and spark-env.sh, I followed this
guide:
http://spark.apache.org/docs/0.8.1/running-on-yarn.html

I get the correct pi value when I run the example without launching the shell.

But when I launch the shell with the following command,

SPARK_JAR=./assembly/target/scala-2.9.3/spark-assembly-0.8.1-incubating-hadoop2.3.0.jar
\
SPARK_YARN_APP_JAR=examples/target/scala-2.9.3/spark-examples-assembly-0.8.1-incubating.jar
\
MASTER=yarn-client ./spark-shell
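
For reference, the same launch with the worker (executor) resources pinned explicitly would look like the sketch below. The SPARK_WORKER_INSTANCES / SPARK_WORKER_MEMORY / SPARK_WORKER_CORES names are the yarn-client knobs listed in the 0.8.1 running-on-yarn docs; the sizes here are only example values for my cluster, not a known-good configuration:

```
# Same launch, with executor resources set explicitly (example values only)
SPARK_JAR=./assembly/target/scala-2.9.3/spark-assembly-0.8.1-incubating-hadoop2.3.0.jar \
SPARK_YARN_APP_JAR=examples/target/scala-2.9.3/spark-examples-assembly-0.8.1-incubating.jar \
SPARK_WORKER_INSTANCES=2 \
SPARK_WORKER_MEMORY=1g \
SPARK_WORKER_CORES=1 \
MASTER=yarn-client ./spark-shell
```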

and try to create RDDs and run an action on them, I get nothing back. After
some time the tasks fail.
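
Concretely, what I run in the shell is along these lines (the path here is a placeholder for my actual HDFS input file; the log's `top at <console>:15` corresponds to the `top` action):

```scala
// Run inside spark-shell; sc is the shell's SparkContext.
// "hdfs:///tmp/input.txt" is a placeholder path.
val lines = sc.textFile("hdfs:///tmp/input.txt")
lines.top(10)   // action: the shell hangs here, then the tasks fail
```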

Spark log file:

519095 14/05/12 13:30:40 INFO YarnClientClusterScheduler:
YarnClientClusterScheduler.postStartHook done

519096 14/05/12 13:30:40 INFO BlockManagerMasterActor$BlockManagerInfo:
Registering block manager s1:38355 with 324.4 MB RAM

519097 14/05/12 13:31:38 INFO MemoryStore: ensureFreeSpace(202584) called
with curMem=0, maxMem=340147568

519098 14/05/12 13:31:38 INFO MemoryStore: Block broadcast_0 stored as
values to memory (estimated size 197.8 KB, free 324.2 MB)

519099 14/05/12 13:31:49 INFO FileInputFormat: Total input paths to process
: 1

519100 14/05/12 13:31:49 INFO NetworkTopology: Adding a new node:
/default-rack/192.168.1.100:50010

519101 14/05/12 13:31:49 INFO SparkContext: Starting job: top at
<console>:15

519102 14/05/12 13:31:49 INFO DAGScheduler: Got job 0 (top at <console>:15)
with 4 output partitions (allowLocal=false)

519103 14/05/12 13:31:49 INFO DAGScheduler: Final stage: Stage 0 (top at
<console>:15)

519104 14/05/12 13:31:49 INFO DAGScheduler: Parents of final stage: List()

519105 14/05/12 13:31:49 INFO DAGScheduler: Missing parents: List()

519106 14/05/12 13:31:49 INFO DAGScheduler: Submitting Stage 0
(MapPartitionsRDD[2] at top at <console>:15), which has no missing parents

519107 14/05/12 13:31:49 INFO DAGScheduler: Submitting 4 missing tasks from
Stage 0 (MapPartitionsRDD[2] at top at <console>:15)

519108 14/05/12 13:31:49 INFO YarnClientClusterScheduler: Adding task set
0.0 with 4 tasks

519109 14/05/12 13:31:49 INFO RackResolver: Resolved s1 to /default-rack

519110 14/05/12 13:31:49 INFO ClusterTaskSetManager: Starting task 0.0:3
as TID 0 on executor 1: s1 (PROCESS_LOCAL)

519111 14/05/12 13:31:49 INFO ClusterTaskSetManager: Serialized task 0.0:3
as 1811 bytes in 4 ms

519112 14/05/12 13:31:49 INFO ClusterTaskSetManager: Starting task 0.0:0
as TID 1 on executor 1: s1 (NODE_LOCAL)

519113 14/05/12 13:31:49 INFO ClusterTaskSetManager: Serialized task 0.0:0
as 1811 bytes in 1 ms

519114 14/05/12 13:32:18 INFO YarnClientSchedulerBackend: Executor 1
disconnected, so removing it

519115 14/05/12 13:32:18 ERROR YarnClientClusterScheduler: Lost executor 1
on s1: remote Akka client shutdown

519116 14/05/12 13:32:18 INFO ClusterTaskSetManager: Re-queueing tasks for
1 from TaskSet 0.0

519117 14/05/12 13:32:18 WARN ClusterTaskSetManager: Lost TID 1 (task
0.0:0)

519118 14/05/12 13:32:18 WARN ClusterTaskSetManager: Lost TID 0 (task
0.0:3)

519119 14/05/12 13:32:18 INFO DAGScheduler: Executor lost: 1 (epoch 0)

519120 14/05/12 13:32:18 INFO BlockManagerMasterActor: Trying to remove
executor 1 from BlockManagerMaster.

519121 14/05/12 13:32:18 INFO BlockManagerMaster: Removed 1 successfully
in removeExecutor


Do I need to set any other environment variables specifically for Spark on
YARN? What could be the issue?

Can anyone please help me with this?

Thanks in advance!
