I don't think you need to copy the jar to the rest of the cluster -
you should be able to call addJar() on the SparkContext, and Spark should
automatically push the jar to the workers for you.
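
Something like this (just a sketch - the master URL, app name, and jar
path are placeholders for whatever your setup uses):

  import org.apache.spark.SparkContext

  val sc = new SparkContext("spark://<master>:7077", "MyApp")
  // ship the assembly jar to the workers so tasks can load its classes
  sc.addJar("/root/my-app-assembly.jar")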

I don't know how set you are on running code by checking it out and
compiling it on the cluster, but here's what I do instead to get my own
application to run:
- compile my code on my desktop and generate a jar
- scp the jar to the master
- modify runExample to include the jar in the classpath (I think you can
also just modify SPARK_CLASSPATH)
- run using something like:

$ runExample my.class.name arg1 arg2 arg3
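
If you'd rather not edit runExample or SPARK_CLASSPATH, another option
(again just a sketch - the master URL, Spark home, and jar path are
placeholders) is to hand the jar list to the SparkContext constructor so
Spark ships it to the workers itself:

  import org.apache.spark.SparkContext

  // master URL, Spark home, and jar path below are placeholders
  val sc = new SparkContext(
    "spark://<master>:7077",
    "MyApp",
    "/root/spark",                     // SPARK_HOME on the cluster
    Seq("/root/my-app-assembly.jar"))  // jars pushed to the workers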

Hope this helps!
Shankari


On Tue, Dec 10, 2013 at 12:15 PM, Jeff Higgens <[email protected]> wrote:

> I'm having trouble running my Spark program as a "fat jar" on EC2.
>
> This is the process I'm using:
> (1) spark-ec2 script to launch cluster
> (2) ssh to master, install sbt and git clone my project's source code
> (3) update source to reference correct master and jar
> (4) sbt assembly
> (5) copy-dir to copy the jar to the rest of the cluster
>
> I tried both running the jar (java -jar ...) and using sbt run, but I
> always end up with this error:
>
> 18:58:59.556 [spark-akka.actor.default-dispatcher-4] INFO  o.a.s.d.client.Client$ClientActor - Connecting to master spark://ec2-50-16-80-0.compute-1.amazonaws.com:7077
> 18:58:59.838 [spark-akka.actor.default-dispatcher-4] ERROR o.a.s.d.client.Client$ClientActor - Connection to master failed; stopping client
> 18:58:59.839 [spark-akka.actor.default-dispatcher-4] ERROR o.a.s.s.c.SparkDeploySchedulerBackend - Disconnected from Spark cluster!
> 18:58:59.840 [spark-akka.actor.default-dispatcher-4] ERROR o.a.s.s.cluster.ClusterScheduler - Exiting due to error from cluster scheduler: Disconnected from Spark cluster
> 18:58:59.844 [delete Spark local dirs] DEBUG org.apache.spark.storage.DiskStore - Shutdown hook called
>
>
> But when I use spark-shell, it has no problem connecting to the master
> using the exact same URL:
>
> 13/12/10 18:59:40 INFO client.Client$ClientActor: Connecting to master spark://ec2-50-16-80-0.compute-1.amazonaws.com:7077
> Spark context available as sc.
>
> I'm probably missing something obvious so any tips are very appreciated.
>
