I'm having trouble running my Spark program as a "fat jar" on EC2.

This is the process I'm using:
(1) spark-ec2 script to launch cluster
(2) ssh to master, install sbt and git clone my project's source code
(3) update the source to reference the correct master URL and the assembled jar (see the sketch after this list)
(4) sbt assembly (build files also sketched below)
(5) copy-dir to copy the jar to the rest of the cluster

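For concreteness, step (3) amounts to roughly the sketch below. This is a minimal outline rather than my exact code; the app name, jar path, and Spark home are placeholders, but the master URL is the same one that appears in the logs:

import org.apache.spark.SparkContext

object MyApp {
  def main(args: Array[String]): Unit = {
    // The master URL is the one spark-ec2 printed; the jar path is a
    // placeholder for wherever sbt assembly writes the fat jar on the master.
    val sc = new SparkContext(
      "spark://ec2-50-16-80-0.compute-1.amazonaws.com:7077", // standalone master
      "MyApp",                                               // app name (placeholder)
      "/root/spark",                                         // Spark home on the EC2 nodes (placeholder)
      Seq("/root/myapp/target/myapp-assembly-0.1.jar"))      // fat jar shipped to the workers
    // ... actual job logic ...
    sc.stop()
  }
}

And step (4) uses a stock sbt-assembly setup along these lines; the plugin and Spark versions here are placeholders, and I'm not 100% sure the Spark version matches what the spark-ec2 AMI is running:

// project/plugins.sbt
addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.9.2")

// build.sbt
import AssemblyKeys._

assemblySettings

name := "myapp"

scalaVersion := "2.9.3"

libraryDependencies += "org.apache.spark" %% "spark-core" % "0.8.0-incubating"
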
I've tried both running the jar directly (java -jar ...) and using sbt run, but I always end up with this error:

18:58:59.556 [spark-akka.actor.default-dispatcher-4] INFO o.a.s.d.client.Client$ClientActor - Connecting to master spark://ec2-50-16-80-0.compute-1.amazonaws.com:7077
18:58:59.838 [spark-akka.actor.default-dispatcher-4] ERROR o.a.s.d.client.Client$ClientActor - Connection to master failed; stopping client
18:58:59.839 [spark-akka.actor.default-dispatcher-4] ERROR o.a.s.s.c.SparkDeploySchedulerBackend - Disconnected from Spark cluster!
18:58:59.840 [spark-akka.actor.default-dispatcher-4] ERROR o.a.s.s.cluster.ClusterScheduler - Exiting due to error from cluster scheduler: Disconnected from Spark cluster
18:58:59.844 [delete Spark local dirs] DEBUG org.apache.spark.storage.DiskStore - Shutdown hook called


But when I use spark-shell, it has no problem connecting to the master using the exact same URL:

13/12/10 18:59:40 INFO client.Client$ClientActor: Connecting to master spark://ec2-50-16-80-0.compute-1.amazonaws.com:7077
Spark context available as sc.

I'm probably missing something obvious, so any tips would be much appreciated.
