That's funny, I didn't delete that answer! I think I have two accounts crossing. Here was the answer:
I don't know if this is going to help, but I agree that some of the docs would lead one to believe that the Spark driver or master is going to spread your jars around for you. But there are other docs that seem to contradict this, especially those related to EC2 clusters. I wrote a Stack Overflow answer dealing with a similar situation; see if it helps: http://stackoverflow.com/questions/23687081/spark-workers-unable-to-find-jar-on-ec2-cluster/34502774#34502774

Pay attention to this section about the spark-submit docs. I must admit, as a limitation on this, it confuses me that the Spark docs say for spark.executor.extraClassPath:

    Users typically should not need to set this option

I assume they mean most people will get the classpath out through a driver config option. I know most of the docs for spark-submit make it sound like the script handles moving your code around the cluster, but I think it only moves the classpath around for you. For example, this line from Launching Applications with spark-submit <http://spark.apache.org/docs/latest/submitting-applications.html#launching-applications-with-spark-submit> explicitly says you have to move the jars yourself or make them "globally available":

    application-jar: Path to a bundled jar including your application and all dependencies. The URL must be globally visible inside of your cluster, for instance, an hdfs:// path or a file:// path that is present on all nodes.

--
View this message in context: http://apache-spark-user-list.1001560.n3.nabble.com/Spark-submit-does-automatically-upload-the-jar-to-cluster-tp25762p25826.html
Sent from the Apache Spark User List mailing list archive at Nabble.com.
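To make the "globally visible" point concrete, here is a sketch of what that looks like in practice, assuming a standalone cluster at spark://master:7077; the jar name, HDFS paths, and main class below are hypothetical placeholders, not anything from the original thread:

```shell
# Put the assembled application jar somewhere every node can read it,
# e.g. HDFS (jar name and path are hypothetical).
hdfs dfs -put target/my-app-assembly.jar /jars/my-app-assembly.jar

# Submit using the hdfs:// URL so each worker can fetch the jar itself.
# A file:// path would only work if the jar already exists at that
# exact path on ALL worker nodes.
spark-submit \
  --class com.example.MyApp \
  --master spark://master:7077 \
  hdfs:///jars/my-app-assembly.jar
```

Extra dependency jars can be listed with --jars using the same kind of globally visible URLs; spark-submit distributes the classpath entries, but it does not copy a local-only jar onto the workers for you in this setup.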