I'm using Spark 1.5.0 with the standalone scheduler, and for the life of me I 
can't figure out why this isn't working.  I have an application that works fine 
with --deploy-mode client, and I'm trying to get it to run in cluster mode so I 
can use --supervise.  I ran into a few issues with my configuration that I had 
to sort out (classpath stuff mostly), but now I'm stumped.  We rely on the 
Databricks spark-csv package, which we're loading with --packages 
"com.databricks:spark-csv_2.11:1.2.0".  This works without issue in client 
mode, but when run in cluster mode, it tries to load the spark-csv jar from 
/root/.ivy2 and fails because that folder doesn't exist on the slave node that 
ends up running the driver.  Does --packages simply not work when the driver is 
launched on the cluster?  Or does it download the jars on the client before 
launching the driver on the cluster, but not pass the downloaded JARs along?
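
For reference, the submit command looks roughly like this (the master host, 
class name, and application jar path below are placeholders, not the real ones 
from our app):

    spark-submit \
      --master spark://<master-host>:7077 \
      --deploy-mode cluster \
      --supervise \
      --packages "com.databricks:spark-csv_2.11:1.2.0" \
      --class com.example.MyApp \
      /path/to/my-app.jar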

Here's my stderr output:

https://gist.github.com/jimbobhickville/1f10b3508ef946eccb92

Thanks in advance for any suggestions.

Greg
