I guess I was a little light on the details in my haste.  I'm using Spark on 
YARN, and this is in the driver process in yarn-client mode (most notably 
spark-shell).  I've had to manually add a bunch of JARs that I had thought it 
would just pick up like everything else does:

export SPARK_SUBMIT_LIBRARY_PATH="/usr/lib/hadoop/lib/native:/usr/lib/hadoop/lib/native/Linux-amd64-64:$SPARK_SUBMIT_LIBRARY_PATH"
export SPARK_SUBMIT_CLASSPATH="/usr/lib/hadoop/lib/hadoop-openstack-2.4.0.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar:/usr/lib/spark-yarn/lib/datanucleus-api-jdo-3.2.6.jar:/usr/lib/spark-yarn/lib/datanucleus-core-3.2.10.jar:/usr/lib/spark-yarn/lib/datanucleus-rdbms-3.2.9.jar:/usr/lib/hadoop/lib/hadoop-lzo-0.6.0.jar:$SPARK_SUBMIT_CLASSPATH"

The lzo jar and the SPARK_SUBMIT_LIBRARY_PATH were required to get anything at 
all to work.  Without them, basic communication failed because it couldn't load 
the lzo library to compress/decompress the data.  The datanucleus stuff was 
required for hive on spark, and the hadoop-openstack and jackson jars are for 
the swiftfs hdfs plugin to work from within spark-shell.

I tried stuff like:

export SPARK_SUBMIT_CLASSPATH="/usr/lib/hadoop/lib/*"

But that didn't work at all.  I have to specify every individual jar like that.
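
A stopgap that at least keeps the string out of my profile is to let the shell 
expand the directory and join the results with colons.  This is only a sketch: 
build_classpath is a helper name I made up (it is not part of Spark), and 
pulling in every jar from a directory could just as easily introduce version 
conflicts as fix missing classes.

```shell
# Hypothetical helper: print every .jar in directory $1 as a
# colon-separated list. The body runs in a subshell, so the
# variables it sets do not leak into the calling shell.
build_classpath() (
  cp=""
  for jar in "$1"/*.jar; do
    [ -e "$jar" ] || continue      # nothing matched: skip the literal pattern
    cp="${cp:+$cp:}$jar"           # add a ':' only between entries
  done
  printf '%s' "$cp"
)

# Prepend the expanded list, keeping any value that was already set.
export SPARK_SUBMIT_CLASSPATH="$(build_classpath /usr/lib/hadoop/lib)${SPARK_SUBMIT_CLASSPATH:+:$SPARK_SUBMIT_CLASSPATH}"
```

That still doesn't explain why the wildcard form is ignored, it just hides the 
long string.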

Is there something I'm missing or some easier way to accomplish this?  I'm 
worried that I'll keep finding more missing dependencies as we explore other 
features and the classpath string is going to take up a whole screen.

Greg

From: Greg <[email protected]>
Date: Tuesday, October 14, 2014 1:57 PM
To: "[email protected]" <[email protected]>
Subject: SPARK_SUBMIT_CLASSPATH question

It seems to me that SPARK_SUBMIT_CLASSPATH does not support wildcards in the 
paths you add, the way other tools do.  It also doesn't seem to pick up the 
classpath information from yarn-site.xml when running on YARN, so I'm having 
to manually add every single dependency JAR.  There must be a better way, so 
what am I missing?

Greg
