I'm using Spark 1.5.0 with the standalone scheduler, and for the life of me I can't figure out why this isn't working. I have an application that runs fine with --deploy-mode client, and I'm trying to get it to run in cluster mode so I can use --supervise. I ran into a few configuration issues that I had to sort out first (mostly classpath related), but now I'm stumped.

We rely on the Databricks spark-csv package, which we load with --packages "com.databricks:spark-csv_2.11:1.2.0". This works without issue in client mode, but in cluster mode the driver tries to load the spark-csv JAR from /root/.ivy2 and fails because that folder doesn't exist on the slave node that ends up running the driver.

Does --packages simply not work when the driver is launched on the cluster? Does spark-submit download the JARs on the client before launching the driver on the cluster, and then not pass the downloaded JARs along?
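For context, the submit command is roughly along these lines (the master URL, main class, and application JAR are placeholders, not the real values; the relevant flags are --deploy-mode, --supervise, and --packages):

```sh
# Works when --deploy-mode is client; in cluster mode the driver fails
# looking for the spark-csv JAR under /root/.ivy2 on the slave node.
spark-submit \
  --master spark://master-host:7077 \
  --deploy-mode cluster \
  --supervise \
  --packages "com.databricks:spark-csv_2.11:1.2.0" \
  --class com.example.MyApp \
  /path/to/my-app-assembly.jar
```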
Here's my stderr output: https://gist.github.com/jimbobhickville/1f10b3508ef946eccb92

Thanks in advance for any suggestions.

Greg