I guess I was a little light on the details in my haste. I'm using Spark on YARN, and this is in the driver process in yarn-client mode (most notably spark-shell). I've had to manually add a bunch of JARs that I had thought it would just pick up like everything else does:

    export SPARK_SUBMIT_LIBRARY_PATH="/usr/lib/hadoop/lib/native:/usr/lib/hadoop/lib/native/Linux-amd64-64:$SPARK_SUBMIT_LIBRARY_PATH"
    export SPARK_SUBMIT_CLASSPATH="/usr/lib/hadoop/lib/hadoop-openstack-2.4.0.jar:/usr/lib/hadoop/lib/jackson-core-asl-1.8.8.jar:/usr/lib/spark-yarn/lib/datanucleus-api-jdo-3.2.6.jar:/usr/lib/spark-yarn/lib/datanucleus-core-3.2.10.jar:/usr/lib/spark-yarn/lib/datanucleus-rdbms-3.2.9.jar:/usr/lib/hadoop/lib/hadoop-lzo-0.6.0.jar:$SPARK_SUBMIT_CLASSPATH"

The lzo jar and the SPARK_SUBMIT_LIBRARY_PATH were required to get anything at all to work. Without them, basic communication failed because it couldn't load the lzo library to compress/decompress the data. The datanucleus stuff was required for hive on spark, and the hadoop-openstack and jackson jars are for the swiftfs hdfs plugin to work from within spark-shell.

I tried stuff like:

    export SPARK_SUBMIT_CLASSPATH="/usr/lib/hadoop/lib/*"

But that didn't work at all. I have to specify every individual jar like that.

Is there something I'm missing or some easier way to accomplish this? I'm worried that I'll keep finding more missing dependencies as we explore other features and the classpath string is going to take up a whole screen.

Greg

From: Greg <[email protected]<mailto:[email protected]>>
Date: Tuesday, October 14, 2014 1:57 PM
To: "[email protected]<mailto:[email protected]>" <[email protected]<mailto:[email protected]>>
Subject: SPARK_SUBMIT_CLASSPATH question

It seems to me that SPARK_SUBMIT_CLASSPATH does not follow the same ability as other tools to put wildcards in the paths you add. For some reason it doesn't pick up the classpath information from yarn-site.xml either, it seems, when running on YARN. I'm having to manually add every single dependency JAR.

There must be a better way, so what am I missing?

Greg
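
A possible stopgap, sketched under the assumption that the wildcard form really is not expanded in this setup (as reported above): let the shell enumerate the jars and join them with colons. The function name build_classpath is hypothetical and not part of Spark or Hadoop. The two directories are the ones already used in the exports above. Note that this pulls in every jar in those directories, which may introduce version conflicts that the hand-picked list avoids.

```shell
# Hypothetical helper (not part of Spark): print every .jar found in the
# given directories as one colon-separated string.
# The body runs in a subshell so no variables leak into the caller.
build_classpath() (
  cp=""
  for dir in "$@"; do
    for jar in "$dir"/*.jar; do
      # Skip the literal pattern when a directory has no jars or is missing.
      [ -e "$jar" ] || continue
      cp="${cp:+$cp:}$jar"
    done
  done
  printf '%s\n' "$cp"
)

# Prepend the generated list, keeping any value that was already set.
export SPARK_SUBMIT_CLASSPATH="$(build_classpath /usr/lib/hadoop/lib /usr/lib/spark-yarn/lib)${SPARK_SUBMIT_CLASSPATH:+:$SPARK_SUBMIT_CLASSPATH}"
```

Running `echo "$SPARK_SUBMIT_CLASSPATH" | tr ':' '\n'` afterwards lists the entries one per line, which makes it easier to check what was actually picked up.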
