We build some Spark jobs with external jars. I currently compile the jobs by
bundling everything into one assembly (fat jar).
But we are looking for an approach that puts all the external jars into HDFS
instead.
We have already put the Spark jar into an HDFS folder and set the SPARK_JAR
variable.
What is the best way to do the same for the other external jars?
SparkContext.addJar()?
Why didn't you like the fat-jar approach?
2014-09-25 16:25 GMT+04:00 rzykov rzy...@gmail.com:
You can pass the HDFS location of those extra jars in the spark-submit
--jars argument. Spark will take care of using YARN's distributed
cache to make them available to the executors. Note that you may need
to provide the full hdfs:// URL (not just the path, since a bare path
would be interpreted as a local one).
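A minimal sketch of what that invocation could look like. The jar names, the
HDFS paths, and the main class are all hypothetical placeholders, not taken
from the thread:

```shell
# Submit a Spark job on YARN, pulling dependency jars from HDFS.
# Fully-qualified hdfs:// URLs are used so the paths are not
# treated as local files on the submitting machine.
spark-submit \
  --master yarn \
  --class com.example.MyJob \
  --jars hdfs://namenode:8020/libs/dep-one.jar,hdfs://namenode:8020/libs/dep-two.jar \
  hdfs://namenode:8020/apps/my-job.jar
```

With this layout the application jar itself stays slim, and the shared
dependency jars live in one HDFS folder that every job can reference,
instead of being re-bundled into each assembly.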