Thanks - I've seen that SO post; it covers spark-submit, which I am not using.
Regarding the ALLUXIO_SPARK_CLIENT variable: it is located on the machine that runs the job that spawns the master=local Spark instance. According to the Spark documentation this should be possible, but in practice it appears not to be.

Once again - I'm trying to solve the use case for master=local, NOT for a cluster and NOT with spark-submit.
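For concreteness, here is a minimal sketch of what I am attempting (the jar path and Alluxio URI are placeholders for my actual setup, and I am assuming `alluxio.hadoop.FileSystem` is the right implementation class):

```scala
import org.apache.spark.sql.SparkSession

// Minimal sketch of the master=local setup described above.
// The jar path stands in for whatever ALLUXIO_SPARK_CLIENT points at.
val spark = SparkSession.builder()
  .master("local[*]")
  .appName("alluxio-local-test")
  // Classpath-related properties must be set before the session is
  // created; spark.conf.set() after startup is too late for them.
  .config("spark.driver.extraClassPath", "/path/to/alluxio-client.jar")
  .config("spark.executor.extraClassPath", "/path/to/alluxio-client.jar")
  .config("spark.hadoop.fs.alluxio.impl", "alluxio.hadoop.FileSystem")
  .getOrCreate()

// In local mode the driver JVM is already running at this point, so the
// driver extraClassPath set above may simply be ignored, which would be
// consistent with the behavior I am seeing.
val df = spark.read.text("alluxio://localhost:19998/some/file.txt")
```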
> On Apr 13, 2018, at 12:47 PM, yohann jardin <yohannjar...@hotmail.com> wrote:
>
> Hey Jason,
>
> It might be related to what is behind your ALLUXIO_SPARK_CLIENT variable and where the lib is located (is it on HDFS, on the node that submits the job, or local to all Spark workers?).
>
> There is a great post on SO about it: https://stackoverflow.com/a/37348234
>
> We might as well check that you are providing the jar correctly based on its location. I have found it tricky in some cases.
>
> As a debugging step, if the jar is not on HDFS, you can copy it there and then specify the full path in the extraClassPath property.
>
> Regards,
> Yohann Jardin
>
> Le 4/13/2018 à 5:38 PM, Jason Boorn a écrit :
>> I do, and this is what I will fall back to if nobody has a better idea :)
>>
>> I was just hoping to get this working as it is much more convenient for my testing pipeline.
>>
>> Thanks again for the help
>>
>>> On Apr 13, 2018, at 11:33 AM, Geoff Von Allmen <ge...@ibleducation.com> wrote:
>>>
>>> Ok - `LOCAL` makes sense now.
>>>
>>> Do you have the option to still use `spark-submit` in this scenario, but with the following options:
>>>
>>> ```bash
>>> --master "local[*]" \
>>> --deploy-mode "client" \
>>> ...
>>> ```
>>>
>>> I know in the past I have set up some options using `.config("Option", "value")` when creating the Spark session, and then other runtime options as you describe above with `spark.conf.set`. At this point, though, I've just moved everything out into a `spark-submit` script.
>>>
>>> On Fri, Apr 13, 2018 at 8:18 AM, Jason Boorn <jbo...@gmail.com> wrote:
>>> Hi Geoff -
>>>
>>> Appreciate the help here - I do understand what you're saying below, and I am able to get this working when I submit a job to a local cluster.
>>>
>>> I think part of the issue here is that there's ambiguity in the terminology. When I say "LOCAL" Spark, I mean an instance of Spark that is created by my driver program and is not a cluster itself. It means that my master node is "local", and this mode is primarily used for testing:
>>>
>>> https://jaceklaskowski.gitbooks.io/mastering-apache-spark/content/spark-local.html
>>>
>>> While I am able to get Alluxio working with spark-submit, I am unable to get it working in local mode. The mechanisms for setting classpaths during spark-submit are not available in local mode. My understanding is that all one is able to use is:
>>>
>>> spark.conf.set("")
>>>
>>> to set any runtime properties of the local instance. Note that it is possible (and I am more convinced of this as time goes on) that Alluxio simply does not work in Spark local mode as described above.
>>>
>>>> On Apr 13, 2018, at 11:09 AM, Geoff Von Allmen <ge...@ibleducation.com> wrote:
>>>>
>>>> I fought with a ClassNotFoundException for quite some time, but it was for Kafka.
>>>>
>>>> The final configuration that got everything working was running spark-submit with the following options:
>>>>
>>>> ```bash
>>>> --jars "/path/to/.ivy2/jars/package.jar" \
>>>> --driver-class-path "/path/to/.ivy2/jars/package.jar" \
>>>> --conf "spark.executor.extraClassPath=/path/to/.ivy2/package.jar" \
>>>> --packages org.some.package:package_name:version
>>>> ```
>>>>
>>>> While this was needed for me to run in cluster mode, it works equally well in client mode.
>>>>
>>>> One other note when you need to supply multiple items to these args: --jars and --packages should be comma-separated; --driver-class-path and extraClassPath should be colon-separated.
>>>>
>>>> HTH
>>>>
>>>> On Fri, Apr 13, 2018 at 4:28 AM, jb44 <jbo...@gmail.com> wrote:
>>>> Haoyuan -
>>>>
>>>> As I mentioned below, I've been through the documentation already. It has not helped me to resolve the issue.
>>>>
>>>> Here is what I have tried so far:
>>>>
>>>> - setting extraClassPath as explained below
>>>> - adding fs.alluxio.impl through SparkConf
>>>> - adding spark.sql.hive.metastore.sharedPrefixes (though I don't believe this matters in my case)
>>>> - compiling the client from source
>>>>
>>>> Do you have any other suggestions on how to get this working?
>>>>
>>>> Thanks
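For reference, my understanding is that Geoff's separator rules carry over when the equivalent properties are set programmatically rather than on the spark-submit command line. A sketch with placeholder paths (not my actual setup):

```scala
import org.apache.spark.SparkConf

// Separator conventions for multi-jar values (paths are placeholders):
val conf = new SparkConf()
  // spark.jars takes a comma-separated list, like --jars and --packages
  .set("spark.jars", "/path/to/a.jar,/path/to/b.jar")
  // the extraClassPath properties use the platform path separator
  // (':' on Unix-like systems), like --driver-class-path
  .set("spark.driver.extraClassPath", "/path/to/a.jar:/path/to/b.jar")
  .set("spark.executor.extraClassPath", "/path/to/a.jar:/path/to/b.jar")
```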