I've put my Spark JAR into HDFS and set the SPARK_JAR environment variable to point to its HDFS location. I'm not using any specialized configuration files (like spark-env.sh); instead I set things per node via environment variables, pass application arguments to the job, or have the job make a ZooKeeper connection to seed properties. From there, I can construct a SparkConf as necessary.
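
Roughly, the SparkConf construction looks like this. This is just a sketch of the pattern: fetchPropsFromZooKeeper, MY_EXECUTOR_MEM, and the app name are placeholders I'm inventing for illustration, not real Spark or ZooKeeper APIs.

    import org.apache.spark.{SparkConf, SparkContext}

    object ExampleJob {
      // Placeholder helper: in reality this would use a ZooKeeper client
      // (e.g. Curator) to read key/value pairs from a known znode.
      def fetchPropsFromZooKeeper(zkConnect: String): Map[String, String] =
        Map.empty // ZooKeeper lookup elided

      def main(args: Array[String]): Unit = {
        // The ZooKeeper connection string arrives as an application argument.
        val seeded = fetchPropsFromZooKeeper(args(0))

        val conf = new SparkConf()
          .setAppName("example-yarn-job")   // placeholder name
          .setAll(seeded)                   // properties seeded from ZooKeeper
          .set("spark.executor.memory",     // per-node environment variable
               sys.env.getOrElse("MY_EXECUTOR_MEM", "2g"))

        val sc = new SparkContext(conf)
        // ... job logic ...
        sc.stop()
      }
    }

(SPARK_JAR itself is just read from the environment by the YARN client side, so it doesn't need to appear in the conf at all.)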
mn

On Sep 2, 2014, at 9:06 AM, Greg Hill <greg.h...@rackspace.com> wrote:

> I'm working on setting up Spark on YARN using the HDP technical preview -
> http://hortonworks.com/kb/spark-1-0-1-technical-preview-hdp-2-1-3/
>
> I have installed the Spark JARs on all the slave nodes and configured YARN
> to find the JARs. It seems like everything is working.
>
> Unless I'm misunderstanding, it seems like there isn't any configuration
> required on the YARN slave nodes at all, apart from telling YARN where to
> find the Spark JAR files. Do the YARN processes even pick up local Spark
> configuration files on the slave nodes, or is that all just pulled in on
> the client and passed along to YARN?
>
> Greg