Hi Gene - 

Are you saying that I just need to figure out how to get the Alluxio jar into 
the classpath of my parent application?  If it shows up in the classpath then 
Spark will automatically know that it needs to use it when communicating with 
Alluxio?

Apologies for going back-and-forth on this - I feel like my particular use case 
is clouding what is already a tricky issue.
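
For concreteness, here is a minimal sketch of the check I have in mind, to run inside the parent application before any SparkSession is created. The names `ClasspathCheck` and `isOnClasspath` are made up for illustration, and I am assuming that `alluxio.hadoop.FileSystem` is the class the Alluxio client jar provides. It only tells me whether the JVM can see that class; it does not prove that Spark will then use it for Alluxio paths.

```scala
object ClasspathCheck {
  // Returns true if this JVM's class loader can load the named class.
  // The class is loaded without being initialized.
  def isOnClasspath(className: String): Boolean =
    try {
      Class.forName(className, false, getClass.getClassLoader)
      true
    } catch {
      // Class missing, or present but one of its dependencies is missing.
      case _: ClassNotFoundException | _: LinkageError => false
    }

  def main(args: Array[String]): Unit = {
    // Assumption: this is the file system class shipped in the Alluxio client jar.
    val alluxioFs = "alluxio.hadoop.FileSystem"
    if (isOnClasspath(alluxioFs))
      println(s"$alluxioFs is visible to this JVM")
    else
      println(s"$alluxioFs is NOT visible; add the client jar when launching the JVM")
  }
}
```

If the check fails, the jar would have to be supplied when the parent JVM is launched (for example through `java -cp` or the `CLASSPATH` environment variable), not through `sparkSession.conf.set` afterwards.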

> On Apr 13, 2018, at 2:26 PM, Gene Pang <gene.p...@gmail.com> wrote:
> 
> Hi Jason,
> 
> Alluxio does work with Spark in master=local mode. This is because both 
> spark-submit and spark-shell have command-line options to set the classpath 
> for the JVM that is being started.
> 
> If you are not using spark-submit or spark-shell, you will have to figure out 
> how to configure that JVM instance with the proper properties.
> 
> Thanks,
> Gene
> 
> On Fri, Apr 13, 2018 at 10:47 AM, Jason Boorn <jbo...@gmail.com> wrote:
> Ok thanks - I was basing my design on this:
> 
> https://databricks.com/blog/2016/08/15/how-to-use-sparksession-in-apache-spark-2-0.html
> 
> Wherein it says:
> "Once the SparkSession is instantiated, you can configure Spark’s runtime 
> config properties."
> Apparently the set of runtime configs you can change does not include 
> the classpath.
> 
> So the answer to my original question is basically this:
> 
> When using local (pseudo-cluster) mode, there is no way to add external jars 
> to the spark instance.  This means that Alluxio will not work with Spark when 
> Spark is run in master=local mode.
> 
> Thanks again - often getting a definitive “no” is almost as good as a yes.  
> Almost ;)
> 
>> On Apr 13, 2018, at 1:21 PM, Marcelo Vanzin <van...@cloudera.com> wrote:
>> 
>> There are two things you're doing wrong here:
>> 
>> On Thu, Apr 12, 2018 at 6:32 PM, jb44 <jbo...@gmail.com> wrote:
>>> Then I can add the alluxio client library like so:
>>> sparkSession.conf.set("spark.driver.extraClassPath", ALLUXIO_SPARK_CLIENT)
>> 
>> First one, you can't modify JVM configuration after it has already
>> started. So this line does nothing since it can't re-launch your
>> application with a new JVM.
>> 
>>> sparkSession.conf.set("spark.executor.extraClassPath", ALLUXIO_SPARK_CLIENT)
>> 
>> There is a lot of configuration that you cannot set after the
>> application has already started. For example, after the session is
>> created, most probably this option will be ignored, since executors
>> will already have started.
>> 
>> I'm not so sure about what happens when you use dynamic allocation,
>> but these post-hoc config changes in general are not expected to take
>> effect.
>> 
>> The documentation could be clearer about this (especially stuff that
>> only applies to spark-submit), but that's the gist of it.
>> 
>> 
>> -- 
>> Marcelo
> 
> 
