Hey Jason,

This might be related to what your ALLUXIO_SPARK_CLIENT variable points to and 
where the library is located (is it on HDFS, on the node that submits the job, 
or local to all Spark workers?).
There is a great post on SO about it: https://stackoverflow.com/a/37348234

We should also check that you are providing the jar correctly for its 
location. I have found this tricky in some cases.
As a debugging step, if the jar is not on HDFS, you can copy it there and then 
specify the full path in the extraClassPath property.
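
To make the location question concrete, here is a minimal sketch of how the URI scheme of a jar passed to `--jars` determines where Spark looks for it, following the dependency-management notes in Spark's "Submitting Applications" guide. The helper name `jar_location` and the example paths are hypothetical and only for illustration; this does not cover extraClassPath, which is a plain JVM classpath entry.

```python
from urllib.parse import urlparse

# Where Spark expects to find a jar given to --jars, by URI scheme.
_SCHEME_NOTES = {
    "file": "on the submitting machine; served to executors by the driver",
    "local": "must already exist at this path on every worker node",
    "hdfs": "pulled by driver and executors from HDFS",
}

def jar_location(path: str) -> str:
    """Return a short note on where Spark expects to find `path`."""
    # A bare absolute path carries no scheme and is treated like file:
    scheme = urlparse(path).scheme or "file"
    return _SCHEME_NOTES.get(scheme, f"fetched via the '{scheme}' scheme")
```

For example, `jar_location("local:/opt/jars/client.jar")` reports that the jar has to be present on every worker already, which is the case that tends to fail silently when the jar only exists on the submitting node.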


Yohann Jardin

On 4/13/2018 at 5:38 PM, Jason Boorn wrote:
I do, and this is what I will fall back to if nobody has a better idea :)

I was just hoping to get this working as it is much more convenient for my 
testing pipeline.

Thanks again for the help

On Apr 13, 2018, at 11:33 AM, Geoff Von Allmen wrote: 
Ok - `LOCAL` makes sense now.

Do you have the option to still use `spark-submit` in this scenario, but using 
the following options:

--master "local[*]" \
--deploy-mode "client" \

I know in the past, I have set up some options using `.config("Option", 
"value")` when creating the spark session, and then other runtime options as 
you describe above with `spark.conf.set`. At this point though I've just moved 
everything out into a `spark-submit` script.

On Fri, Apr 13, 2018 at 8:18 AM, Jason Boorn wrote: 
Hi Geoff -

Appreciate the help here - I do understand what you’re saying below.  And I am 
able to get this working when I submit a job to a local cluster.

I think part of the issue here is that there’s ambiguity in the terminology.  
When I say “LOCAL” spark, I mean an instance of spark that is created by my 
driver program, and is not a cluster itself.  It means that my master node is 
“local”, and this mode is primarily used for testing.


While I am able to get alluxio working with spark-submit, I am unable to get it 
working when using local mode.  The mechanisms for setting class paths during 
spark-submit are not available in local mode.  My understanding is that all one 
is able to use is:


To set any runtime properties of the local instance.  Note that it is possible 
(and I am more convinced of this as time goes on) that alluxio simply does not 
work in spark local mode as described above.
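
As a rough sketch of the local-mode setup being discussed, the properties involved could be collected and applied through the session builder as below. The helper names `alluxio_local_conf` and `apply_conf` and the jar path are hypothetical; the property keys and `alluxio.hadoop.FileSystem` are the ones from the Spark and Alluxio documentation. This is not a confirmed fix: Spark's configuration docs note that in client mode spark.driver.extraClassPath must not be set through SparkConf inside the application, because the driver JVM has already started by then, which may be why setting it programmatically has no effect here.

```python
def alluxio_local_conf(client_jar: str) -> dict:
    """Properties one might try for a local-mode session (hypothetical helper)."""
    return {
        "spark.master": "local[*]",
        "spark.driver.extraClassPath": client_jar,
        "spark.executor.extraClassPath": client_jar,
        # Keys prefixed with spark.hadoop. are copied into the Hadoop configuration.
        "spark.hadoop.fs.alluxio.impl": "alluxio.hadoop.FileSystem",
    }

def apply_conf(builder, conf: dict):
    """Apply each property through builder.config(key, value)."""
    for key, value in conf.items():
        builder = builder.config(key, value)
    return builder

# With PySpark installed, usage would look like (path is a placeholder):
#   from pyspark.sql import SparkSession
#   conf = alluxio_local_conf("/path/to/alluxio-client.jar")
#   spark = apply_conf(SparkSession.builder, conf).getOrCreate()
```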

On Apr 13, 2018, at 11:09 AM, Geoff Von Allmen wrote: 
I fought with a ClassNotFoundException for quite some time, but it was for 

The final configuration that got everything working was running spark-submit 
with the following options:

--jars "/path/to/.ivy2/jars/package.jar" \
--driver-class-path "/path/to/.ivy2/jars/package.jar" \
--conf "spark.executor.extraClassPath=/path/to/.ivy2/jars/package.jar" \
--packages org.some.package:package_name:version

While this was needed for me to run in cluster mode, it works equally well in 
client mode.

One other note when you need to supply multiple items to these args: --jars 
and --packages should be comma-separated, while --driver-class-path and 
extraClassPath should be colon-separated (:).
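
The separator rule can be sketched as a small helper that assembles the options shown above. The function name `spark_submit_args` is hypothetical, and the colon separator assumes a Unix-like system (the JVM classpath separator differs on Windows).

```python
def spark_submit_args(jars, packages, class_path):
    """Build spark-submit options with the separator each one expects."""
    cp = ":".join(class_path)  # classpath entries are colon-separated
    return [
        "--jars", ",".join(jars),          # comma-separated
        "--driver-class-path", cp,
        "--conf", f"spark.executor.extraClassPath={cp}",
        "--packages", ",".join(packages),  # comma-separated
    ]
```

The returned list can be appended to `["spark-submit"]` and passed to `subprocess.run`, which avoids shell quoting issues when paths contain spaces.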



On Fri, Apr 13, 2018 at 4:28 AM, jb44 wrote: 
Haoyuan -

As I mentioned below, I've been through the documentation already.  It has
not helped me to resolve the issue.

Here is what I have tried so far:

- setting extraClassPath as explained below
- adding fs.alluxio.impl through sparkconf
- adding spark.sql.hive.metastore.sharedPrefixes (though I don't believe
this matters in my case)
- compiling the client from source

Do you have any other suggestions on how to get this working?


