I ran into a similar issue a few months back - pay careful attention to the 
order in which Spark looks for your jars. The root of my problem was a stale 
jar in SPARK_CLASSPATH on the worker nodes, which (IIRC) took precedence over 
the jars passed in through the SparkContext constructor. 
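For reference, this is roughly how I ended up handing the fat jar to the 
context myself rather than relying on the workers' classpath. It's only a 
minimal sketch against the 0.8-era SparkContext constructor; the app name and 
jar path are placeholders, and /root/spark is just where SPARK_HOME lives on 
the spark-ec2 AMI:

  import org.apache.spark.SparkContext

  // Ship the assembly jar to the executors explicitly instead of relying
  // on whatever is sitting in SPARK_CLASSPATH on the worker nodes.
  val sc = new SparkContext(
    "spark://ec2-50-16-80-0.compute-1.amazonaws.com:7077", // master URL
    "MyApp",                          // placeholder application name
    "/root/spark",                    // SPARK_HOME on the EC2 AMI
    Seq("/root/my-app-assembly.jar")  // fat jar produced by sbt assembly
  )

  // Equivalently, once the context exists,
  // sc.addJar("/root/my-app-assembly.jar") distributes an extra jar the
  // same way.

Either way, double-check that no stale copy of the jar is still on the 
workers' SPARK_CLASSPATH, because that copy can shadow the one you ship.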

> On Dec 20, 2013, at 8:49 PM, "K. Shankari" <[email protected]> wrote:
> 
> I don't think that you need to copy the jar to the rest of the cluster - you 
> should be able to do addJar() on the SparkContext and Spark should 
> automatically push the jars to the workers for you.
> 
> I don't know how set you are on running code through checking out and 
> compiling, but here's what I do instead to get my own application to run:
> - compile my code on my desktop and generate a jar
> - scp the jar to the master
> - modify runExample to include the jar in the classpath. I think that you can 
> also just modify SPARK_CLASSPATH
> - run using something like:
> 
> $ runExample my.class.name arg1 arg2 arg3
> 
> Hope this helps!
> Shankari
> 
> 
>> On Tue, Dec 10, 2013 at 12:15 PM, Jeff Higgens <[email protected]> wrote:
>> I'm having trouble running my Spark program as a "fat jar" on EC2.
>> 
>> This is the process I'm using:
>> (1) spark-ec2 script to launch cluster
>> (2) ssh to master, install sbt and git clone my project's source code
>> (3) update source to reference correct master and jar
>> (4) sbt assembly
>> (5) copy-dir to copy the jar to the rest of the cluster
>> 
>> I tried both running the jar (java -jar ...) and using sbt run, but I always 
>> end up with this error:
>> 
>> 18:58:59.556 [spark-akka.actor.default-dispatcher-4] INFO  
>> o.a.s.d.client.Client$ClientActor - Connecting to master 
>> spark://ec2-50-16-80-0.compute-1.amazonaws.com:7077
>> 18:58:59.838 [spark-akka.actor.default-dispatcher-4] ERROR 
>> o.a.s.d.client.Client$ClientActor - Connection to master failed; stopping 
>> client
>> 18:58:59.839 [spark-akka.actor.default-dispatcher-4] ERROR 
>> o.a.s.s.c.SparkDeploySchedulerBackend - Disconnected from Spark cluster!
>> 18:58:59.840 [spark-akka.actor.default-dispatcher-4] ERROR 
>> o.a.s.s.cluster.ClusterScheduler - Exiting due to error from cluster 
>> scheduler: Disconnected from Spark cluster
>> 18:58:59.844 [delete Spark local dirs] DEBUG 
>> org.apache.spark.storage.DiskStore - Shutdown hook called
>> 
>> 
>> But when I use spark-shell it has no problems connecting to the master using 
>> the exact same URL: 
>> 
>> 13/12/10 18:59:40 INFO client.Client$ClientActor: Connecting to master 
>> spark://ec2-50-16-80-0.compute-1.amazonaws.com:7077
>> Spark context available as sc.
>> 
>> I'm probably missing something obvious so any tips are very appreciated.
> 
