Andrew, brilliant! I built with Java 7 but was still running our cluster on Java 6. Upgrading the cluster made it work, with slight tweaks to the args (it looks like the app args come first and yarn-standalone comes last):

SPARK_JAR=./assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar \
./bin/spark-class org.apache.spark.deploy.yarn.Client \
  --jar examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar \
  --class org.apache.spark.examples.SparkPi \
  --args 10 \
  --args yarn-standalone \
  --num-workers 3 \
  --master-memory 4g \
  --worker-memory 2g \
  --worker-cores 1
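
For the archives, I believe the equivalent spark-submit invocation looks something like this, going by the 1.0 docs (I haven't run this exact command yet, so treat it as my translation of the old flags rather than a verified recipe):

./bin/spark-submit \
  --class org.apache.spark.examples.SparkPi \
  --master yarn-cluster \
  --num-executors 3 \
  --driver-memory 4g \
  --executor-memory 2g \
  --executor-cores 1 \
  examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar \
  10

Here yarn-cluster is the new name for the old yarn-standalone mode, and the trailing 10 is the slices argument to SparkPi.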
I'll make sure to use spark-submit from here on out.
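
One more note in case it helps the next person: before upgrading, it should be possible to confirm the Java mismatch directly. This is an untested sketch on my part (paths are from my build above; adjust per node):

# On each executor node, print the Java version the containers will use
# (the one JAVA_HOME points to, not just whatever is first on PATH):
$JAVA_HOME/bin/java -version

# An assembly packaged by Java 7 can be unreadable by Java 6 (SPARK-1520).
# If a Java 6 jar tool cannot even list the assembly, packaging is the culprit:
jar tf ./assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar \
  > /dev/null && echo "assembly readable by this JVM"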
Thanks very much!
Jon

On Thu, May 22, 2014 at 12:40 PM, Andrew Or <and...@databricks.com> wrote:

> Hi Jon,
>
> Your configuration looks largely correct. I have very recently confirmed
> that the way you launch SparkPi also works for me.
>
> I have run into the same problem a bunch of times. My best guess is that
> this is a Java version issue. If the Spark assembly jar is built with
> Java 7, it cannot be opened by Java 6, because the two versions use
> different packaging schemes. This is a known issue:
> https://issues.apache.org/jira/browse/SPARK-1520.
>
> The workaround is to make sure that all your executor nodes are running
> Java 7 and, very importantly, that JAVA_HOME points to this version. You
> can achieve this through
>
> export SPARK_YARN_USER_ENV="JAVA_HOME=/path/to/java7/home"
>
> in spark-env.sh. Another safe alternative, of course, is to just build
> the jar with Java 6. An additional debugging step is to review the launch
> environment of all the containers; this is detailed in the last paragraph
> of this section:
> http://people.apache.org/~pwendell/spark-1.0.0-rc7-docs/running-on-yarn.html#debugging-your-application.
> It may not be necessary, but I have personally found it immensely useful.
>
> One last thing: launching Spark applications through
> org.apache.spark.deploy.yarn.Client is deprecated in Spark 1.0. You
> should use bin/spark-submit instead. You can find information about its
> usage in the docs linked above, or simply through the --help option.
>
> Cheers,
> Andrew
>
>
> 2014-05-22 11:38 GMT-07:00 Jon Bender <jonathan.ben...@gmail.com>:
>
>> Hey all,
>>
>> I'm working through the basic SparkPi example on a YARN cluster, and I'm
>> wondering why my containers don't pick up the Spark assembly classes.
>>
>> I built the latest Spark code against CDH5.0.0, then ran the following:
>>
>> SPARK_JAR=./assembly/target/scala-2.10/spark-assembly-1.0.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar \
>> ./bin/spark-class org.apache.spark.deploy.yarn.Client \
>>   --jar examples/target/scala-2.10/spark-examples-1.0.0-SNAPSHOT-hadoop2.3.0-cdh5.0.0.jar \
>>   --class org.apache.spark.examples.SparkPi \
>>   --args yarn-standalone \
>>   --num-workers 3 \
>>   --master-memory 4g \
>>   --worker-memory 2g \
>>   --worker-cores 1
>>
>> The job dies, and in the stderr from the containers I see:
>>
>> Exception in thread "main" java.lang.NoClassDefFoundError:
>> org/apache/spark/deploy/yarn/ApplicationMaster
>> Caused by: java.lang.ClassNotFoundException:
>> org.apache.spark.deploy.yarn.ApplicationMaster
>>     at java.net.URLClassLoader$1.run(URLClassLoader.java:217)
>>     at java.security.AccessController.doPrivileged(Native Method)
>>     at java.net.URLClassLoader.findClass(URLClassLoader.java:205)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:321)
>>     at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:294)
>>     at java.lang.ClassLoader.loadClass(ClassLoader.java:266)
>>
>> My yarn-site.xml contains the following classpath:
>>
>> <property>
>>   <name>yarn.application.classpath</name>
>>   <value>
>>     /etc/hadoop/conf/,
>>     /usr/lib/hadoop/*,/usr/lib/hadoop/lib/*,
>>     /usr/lib/hadoop-hdfs/*,/usr/lib/hadoop-hdfs/lib/*,
>>     /usr/lib/hadoop-mapreduce/*,/usr/lib/hadoop-mapreduce/lib/*,
>>     /usr/lib/hadoop-yarn/*,/usr/lib/hadoop-yarn/lib/*,
>>     /usr/lib/avro/*
>>   </value>
>> </property>
>>
>> I've confirmed that the spark-assembly JAR contains this class. Does it
>> actually need to be defined in yarn.application.classpath, or should the
>> Spark client take care of ensuring the necessary JARs are added during
>> job submission?
>>
>> Any tips would be greatly appreciated!
>> Cheers,
>> Jon
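
P.S. For anyone following Andrew's container-debugging tip: per the running-on-yarn docs he linked, you can keep finished containers' directories around long enough to inspect them by setting a large delete delay in yarn-site.xml (36000 is the example value the docs use), then look for launch_container.sh under the directories in yarn.nodemanager.local-dirs; it records each container's full launch environment, JAVA_HOME included.

<property>
  <name>yarn.nodemanager.delete.debug-delay-sec</name>
  <value>36000</value>
</property>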