Hi Nupur,

Is what you're trying to do already possible via the
spark.{driver,executor}.userClassPathFirst options?
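If so, a minimal sketch of what that could look like (the jar paths below are
placeholders, both options are marked experimental, and the driver-side one
generally has to be supplied at launch, e.g. via --conf to spark-submit in
cluster mode, rather than set after the JVM is already up):

  import org.apache.spark.SparkConf
  import org.apache.spark.sql.SparkSession

  // Prefer user-supplied jars over Spark's own classpath when resolving classes.
  val conf = new SparkConf()
    .set("spark.driver.userClassPathFirst", "true")   // driver side (cluster mode)
    .set("spark.executor.userClassPathFirst", "true") // executor side
    .set("spark.jars", "/somepath/sample-jar-2.0.0.jar,/somepath/new-jar-1.0.0.jar")
  val spark = SparkSession.builder().config(conf).getOrCreate()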
https://github.com/apache/spark/blob/b890fdc8df64f1d0b0f78b790d36be883e852b0d/core/src/main/scala/org/apache/spark/deploy/SparkSubmit.scala#L853

On Wed, Jul 22, 2020 at 5:50 PM nupurshukla <nupur14shu...@gmail.com> wrote:

> Hello,
>
> I am prototyping a change in the behavior of the spark.jars conf for my
> use case. The spark.jars conf specifies a list of jars to include on the
> driver and executor classpaths.
>
> *Current behavior:* The spark.jars conf value is not read until after the
> JVM has started and the system classloader has loaded, so the jars added
> via this conf are "appended" to the Spark classpath. Spark looks for a jar
> in its default classpath first, and only then at the paths specified in
> spark.jars.
>
> *Proposed prototype:* I am proposing a new behavior in which spark.jars
> takes precedence over the Spark default classpath in how jars are
> discovered. This can be achieved using the
> spark.{driver,executor}.extraClassPath conf. That conf modifies the actual
> launch command of the driver (or executors), so its path is "prepended" to
> the classpath and thus takes precedence over the default classpath. Could
> the behavior of spark.jars be modified by adding its value to
> spark.{driver,executor}.extraClassPath during argument parsing in
> SparkSubmitArguments.scala
> <https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala#L151>,
> so that we achieve the precedence order (highest first):
> spark.jars > spark.{driver,executor}.extraClassPath > Spark default
> classpath?
>
> *Pseudo sample code:* In loadEnvironmentArguments()
> <https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/deploy/SparkSubmitArguments.scala#L151>:
>
> if (jars != null) {
>   // spark.jars is comma-separated; extraClassPath entries use the
>   // platform path separator
>   val sep = java.io.File.pathSeparator
>   val jarsAsClassPath = jars.split(",").mkString(sep)
>   driverExtraClassPath =
>     if (driverExtraClassPath != null) {
>       jarsAsClassPath + sep + driverExtraClassPath // prepend: spark.jars wins
>     } else {
>       jarsAsClassPath
>     }
> }
>
> *As an example*, consider the jars:
> sample-jar-1.0.0.jar, present in Spark's default classpath
> sample-jar-2.0.0.jar, present on all nodes of the cluster at path
> /<somepath>/
> new-jar-1.0.0.jar, present on all nodes of the cluster at path /<somepath>/
> (and not in the Spark default classpath)
>
> and two scenarios, in which two Spark jobs are submitted with the
> following spark.jars conf values:
>
> [image: Capture.png]
> <http://apache-spark-developers-list.1001551.n3.nabble.com/file/t3705/Capture.png>
>
> What are your thoughts on this? Could it have any undesired side effects?
> Or has this already been explored, and are there known issues with this
> approach?
>
> Thanks,
> Nupur
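Whichever approach you end up prototyping, one quick way to verify which copy
of a class actually won at runtime is to ask for its code source
(com.example.Sample below is a hypothetical class shipped in both
sample-jar-1.0.0.jar and sample-jar-2.0.0.jar):

  // Prints the jar the class was actually loaded from.
  // getCodeSource can be null for classes from the bootstrap classloader.
  val cls = Class.forName("com.example.Sample")
  Option(cls.getProtectionDomain.getCodeSource).foreach(cs => println(cs.getLocation))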