I don't think you need to copy the jar to the rest of the cluster - you should be able to call addJar() on the SparkContext, and Spark should automatically push the jar out to the workers for you.
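For example, a minimal sketch - the master URL is taken from your log below, and the jar path is just a placeholder:

    import org.apache.spark.SparkContext

    // Master URL copied from your log; the jar path is hypothetical.
    val sc = new SparkContext(
      "spark://ec2-50-16-80-0.compute-1.amazonaws.com:7077", "MyApp")
    sc.addJar("/root/my-app-assembly.jar") // shipped to the workers for you
    sc.parallelize(1 to 100).count()       // quick sanity check of the cluster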
I don't know how set you are on checking out and compiling the code on the cluster, but here's what I do instead to get my own application to run:

- compile my code on my desktop and generate a jar
- scp the jar to the master
- modify runExample to include the jar in the classpath (I think you can also just modify SPARK_CLASSPATH)
- run using something like:

  $ runExample my.class.name arg1 arg2 arg3

  (I've sketched this full sequence below the quoted message.)

Hope this helps!
Shankari

On Tue, Dec 10, 2013 at 12:15 PM, Jeff Higgens <[email protected]> wrote:

> I'm having trouble running my Spark program as a "fat jar" on EC2.
>
> This is the process I'm using:
> (1) spark-ec2 script to launch cluster
> (2) ssh to master, install sbt and git clone my project's source code
> (3) update source to reference the correct master and jar
> (4) sbt assembly
> (5) copy-dir to copy the jar to the rest of the cluster
>
> I tried both running the jar (java -jar ...) and using sbt run, but I
> always end up with this error:
>
> 18:58:59.556 [spark-akka.actor.default-dispatcher-4] INFO
> o.a.s.d.client.Client$ClientActor - Connecting to master
> spark://ec2-50-16-80-0.compute-1.amazonaws.com:7077
> 18:58:59.838 [spark-akka.actor.default-dispatcher-4] ERROR
> o.a.s.d.client.Client$ClientActor - Connection to master failed; stopping
> client
> 18:58:59.839 [spark-akka.actor.default-dispatcher-4] ERROR
> o.a.s.s.c.SparkDeploySchedulerBackend - Disconnected from Spark cluster!
> 18:58:59.840 [spark-akka.actor.default-dispatcher-4] ERROR
> o.a.s.s.cluster.ClusterScheduler - Exiting due to error from cluster
> scheduler: Disconnected from Spark cluster
> 18:58:59.844 [delete Spark local dirs] DEBUG
> org.apache.spark.storage.DiskStore - Shutdown hook called
>
> But when I use spark-shell it has no problems connecting to the master
> using the exact same URL:
>
> 13/12/10 18:59:40 INFO client.Client$ClientActor: Connecting to master
> spark://ec2-50-16-80-0.compute-1.amazonaws.com:7077
> Spark context available as sc.
>
> I'm probably missing something obvious, so any tips are very appreciated.
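P.S. Here's roughly what that sequence looks like end to end. The jar name, username, and hostname are all placeholders, and this uses the SPARK_CLASSPATH variant rather than editing runExample:

    # on your desktop: build the fat jar
    sbt assembly

    # copy it to the master (hostname and paths are placeholders)
    scp target/my-app-assembly.jar root@ec2-50-16-80-0.compute-1.amazonaws.com:/root/

    # on the master: put the jar on the classpath and run
    export SPARK_CLASSPATH=/root/my-app-assembly.jar
    $ runExample my.class.name arg1 arg2 arg3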
