The Hadoop conf dir is what controls which YARN cluster the job is submitted to, so it is just a matter of pointing it at the correct configs for the cluster you want to target.
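A minimal sketch of that (the directory path below is a placeholder for whatever layout your cluster uses, not something from this thread):

```shell
# Placeholder path: point at the client-side configs (core-site.xml,
# yarn-site.xml) of whichever cluster the job should be submitted to.
export HADOOP_CONF_DIR=/etc/hadoop/conf

# Optional: makes spark-class print the exact java command it executes,
# which helps when debugging classpath or launch problems.
export SPARK_PRINT_LAUNCH_COMMAND=1
```

Switching `HADOOP_CONF_DIR` between two config directories is enough to direct the same `spark-class` invocation at two different clusters.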
You have to execute org.apache.spark.deploy.yarn.Client, or your application will not run on YARN in standalone mode. The Client is what has the logic to submit the application to YARN and start it there; your application code just gets started in a thread under the YARN ApplicationMaster. If you export SPARK_PRINT_LAUNCH_COMMAND=1 when you run the spark-class command, you will see the java command it executes.

Note that the Spark-on-YARN standalone (yarn-standalone) mode is more of a batch mode: you are expected to submit a pre-defined application, it runs for a certain (relatively short) period, and then it exits. It is not really for long-lived processes, interactive querying, or the Shark-server model where you submit multiple jobs to the same SparkContext. The 0.8.1 release adds a client mode for YARN that lets you run the Spark shell and may fit your use case better. See https://github.com/apache/incubator-spark/blob/branch-0.8/docs/running-on-yarn.md and look at yarn-client mode.

Tom

On Monday, December 16, 2013 10:02 AM, "Karavany, Ido" <[email protected]> wrote:

Hi All,

We've started deploying Spark on Hadoop 2 and YARN. Our previous configuration (still not a production cluster) was Spark on Mesos. We're running a Java application (which runs from a Tomcat server). The application builds a singleton Java SparkContext when it is first launched, and all subsequent user requests are executed through that same context. With Mesos, creating the context involved a few simple operations and was possible from the Java application.

I successfully executed the Spark-on-YARN example and even my own example (although I was unable to find the output logs). I noticed that this is done using org.apache.spark.deploy.yarn.Client, but I have no example of how it can be done.
Successful command:

SPARK_JAR=/app/spark-0.8.0-incubating/assembly/target/scala-2.9.3/spark-assembly-0.8.0-incubating-hadoop2.0.4-Intel.jar ./spark-class org.apache.spark.deploy.yarn.Client --jar /app/iot/test/test3-0.0.1-SNAPSHOT.jar --class test3.yarntest --args yarn-standalone --num-workers 3 --master-memory 4g --worker-memory 2g --worker-cores

When I try to emulate the method we used previously and simply execute my test jar, the execution hangs. Our main goal is to be able to create a singleton SparkContext on YARN from Java code (not a shell script). In addition, the application should be executed against a remote YARN cluster, not a local one. Can you please advise?

Thanks,
Ido

Problematic command:

/usr/java/latest/bin/java -cp /usr/lib/hbase/hbase-0.94.7-Intel.jar:/usr/lib/hadoop/hadoop-auth-2.0.4-Intel.jar:/usr/lib/hadoop/lib/commons-cli-1.2.jar:/app/spark-0.8.0-incubating/conf:/app/spark-0.8.0-incubating/assembly/target/scala-2.9.3/spark-assembly-0.8.0-incubating-hadoop2.0.4-Intel.jar:/etc/hadoop/conf:/etc/hbase/conf:/etc/hadoop/conf:/app/iot/test/test3-0.0.1-SNAPSHOT.jar -Djava.library.path=/usr/lib/hadoop/lib/native -Xms512m -Xmx512m test3.yarntest

Spark context code piece:

JavaSparkContext sc = new JavaSparkContext("yarn-standalone", "SPARK YARN TEST");

Log:

13/12/12 17:30:36 INFO slf4j.Slf4jEventHandler: Slf4jEventHandler started
13/12/12 17:30:36 INFO spark.SparkEnv: Registering BlockManagerMaster
13/12/12 17:30:36 INFO storage.MemoryStore: MemoryStore started with capacity 323.9 MB.
13/12/12 17:30:36 INFO storage.DiskStore: Created local directory at /tmp/spark-local-20131212173036-09c0
13/12/12 17:30:36 INFO network.ConnectionManager: Bound socket to port 39426 with id = ConnectionManagerId(ip-172-31-43-121.eu-west-1.compute.internal,39426)
13/12/12 17:30:36 INFO storage.BlockManagerMaster: Trying to register BlockManager
13/12/12 17:30:36 INFO storage.BlockManagerMaster: Registered BlockManager
13/12/12 17:30:37 INFO server.Server: jetty-7.x.y-SNAPSHOT
13/12/12 17:30:37 INFO server.AbstractConnector: Started [email protected]:43438
13/12/12 17:30:37 INFO broadcast.HttpBroadcast: Broadcast server started at http://172.31.43.121:43438
13/12/12 17:30:37 INFO spark.SparkEnv: Registering MapOutputTracker
13/12/12 17:30:37 INFO spark.HttpFileServer: HTTP File server directory is /tmp/spark-b48abc5a-53c6-4af1-9c3c-725e1cd7fbb9
13/12/12 17:30:37 INFO server.Server: jetty-7.x.y-SNAPSHOT
13/12/12 17:30:37 INFO server.AbstractConnector: Started [email protected]:60476
13/12/12 17:30:37 INFO server.Server: jetty-7.x.y-SNAPSHOT
13/12/12 17:30:37 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/storage/rdd,null}
13/12/12 17:30:37 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/storage,null}
13/12/12 17:30:37 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/stages/stage,null}
13/12/12 17:30:37 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/stages/pool,null}
13/12/12 17:30:37 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/stages,null}
13/12/12 17:30:37 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/environment,null}
13/12/12 17:30:37 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/executors,null}
13/12/12 17:30:37 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/metrics/json,null}
13/12/12 17:30:37 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/static,null}
13/12/12 17:30:37 INFO handler.ContextHandler: started o.e.j.s.h.ContextHandler{/,null}
13/12/12 17:30:37 INFO server.AbstractConnector: Started [email protected]:4040
13/12/12 17:30:37 INFO ui.SparkUI: Started Spark Web UI at http://ip-172-31-43-121.eu-west-1.compute.internal:4040
13/12/12 17:30:37 INFO cluster.YarnClusterScheduler: Created YarnClusterScheduler
13/12/12 17:30:37 INFO yarn.ApplicationMaster$$anon$1: Adding shutdown hook for context org.apache.spark.SparkContext@475a07bf
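On the "from Java code" goal: one hedged sketch, assuming the 0.8-era layout where org.apache.spark.deploy.yarn.Client exposes a main method taking the same flags spark-class passes it, is to build that argument list in your Java application and invoke the Client in-process instead of shelling out. The class name below is hypothetical, and the actual Client call is left commented out because it needs the Spark assembly on the classpath, HADOOP_CONF_DIR pointing at the target cluster, and a live YARN cluster:

```java
// Sketch (assumption, not confirmed in this thread): invoke the YARN Client
// from Java by handing it the same flags the spark-class launch would use.
public class YarnClientLauncher {
    // Build the argument list spark-class would pass to the YARN Client.
    // Paths and class names are taken from the command earlier in the thread.
    static String[] buildArgs() {
        return new String[] {
            "--jar", "/app/iot/test/test3-0.0.1-SNAPSHOT.jar",
            "--class", "test3.yarntest",
            "--args", "yarn-standalone",
            "--num-workers", "3",
            "--master-memory", "4g",
            "--worker-memory", "2g"
        };
    }

    public static void main(String[] argv) {
        String[] args = buildArgs();
        // Hypothetical in-process submission; requires the Spark assembly jar
        // on the classpath and HADOOP_CONF_DIR set to the remote cluster's
        // client configs, so it is commented out in this self-contained sketch:
        // org.apache.spark.deploy.yarn.Client.main(args);
        System.out.println(String.join(" ", args));
    }
}
```

Even with this approach, the application itself still runs under the YARN ApplicationMaster, so it does not by itself give you a long-lived singleton SparkContext inside Tomcat; the yarn-client mode noted above is the direction for that.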
