Hi Matt,

If you place your jars on HDFS in a public location, YARN will cache them on each node after the first download. You can also use the spark.executor.extraClassPath config to point to them.
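As a rough sketch of the two options above, the following Python helper assembles a spark-submit command line. The helper name build_spark_submit, the class name, and every path in the usage lines are hypothetical placeholders, not part of Spark. It assumes the HDFS jars are passed with --jars, and that extraClassPath entries refer to paths already present on each node's local disk.

```python
import shlex


def build_spark_submit(app_jar, main_class, hdfs_jars=(), local_classpath=(),
                       master="yarn"):
    """Build a spark-submit argument list (hypothetical helper, not a Spark API)."""
    cmd = ["spark-submit", "--master", master, "--class", main_class]
    if hdfs_jars:
        # Option 1: jars already uploaded to a world-readable HDFS directory.
        # --jars takes a comma-separated list of jar paths or URIs.
        cmd += ["--jars", ",".join(hdfs_jars)]
    if local_classpath:
        # Option 2: jars assumed to be pre-deployed on every node's local disk.
        # Classpath entries are joined with ':' (Linux nodes assumed).
        cmd += ["--conf",
                "spark.executor.extraClassPath=" + ":".join(local_classpath)]
    cmd.append(app_jar)
    return cmd


# Usage with placeholder values; every path and name below is made up.
example = build_spark_submit(
    "my-app.jar",
    "com.example.Main",
    hdfs_jars=["hdfs:///shared/libs/dep-a.jar", "hdfs:///shared/libs/dep-b.jar"],
    local_classpath=["/opt/myapp/libs/*"],
)
print(" ".join(shlex.quote(arg) for arg in example))
```

The two options are independent, so either argument can be left empty. If the driver also needs the pre-deployed jars, spark.driver.extraClassPath is the corresponding setting on that side.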
-Sandy

On Wed, Jun 17, 2015 at 4:47 PM, Sweeney, Matt <mswee...@fourv.com> wrote:

> Hi folks,
>
> I’m looking to deploy spark on YARN and I have read through the docs (
> https://spark.apache.org/docs/latest/running-on-yarn.html). One question
> that I still have is if there is an alternate means of including your own
> app jars as opposed to the process in the “Adding Other Jars” section of
> the docs. The app jars and dependencies that I need to include are
> significant in size (100s MBs) and I’d rather deploy them in advance onto
> the cluster nodes disk so that I don’t have that overhead cost on the
> network for each spark-submit that is executed.
>
> Thanks in advance for your help!
>
> Matt