Hi all,

While trying to resolve SPARK-3166 (inability to ship custom serialisers in application jars):

https://issues.apache.org/jira/browse/SPARK-3166

I've discovered that the code for building the Executor launch command is duplicated across SparkDeploySchedulerBackend.scala, MesosSchedulerBackend.scala, and CoarseMesosSchedulerBackend.scala.
Importantly, there is a slight difference in their behaviour where SparkDeploySchedulerBackend doesn't launch the Executor with the spark-class script, but instead tries to do something similar in CommandUtils.scala. MesosSchedulerBackend.scala and CoarseMesosSchedulerBackend.scala both use the spark-class script.

Is the latter the preferred approach? So should I refactor all of these to use spark-class, or is there a reason for the differing behaviour?

Secondly, the goal of SPARK-3166 is to have the user jar available to the executor process at launch time (rather than when the first task is received). I'd like to get some feedback on what the preferred classpath order should be. The items to be ordered to determine the classpath are:

* The output of the compute-classpath script
* The config option spark.executor.extraClassPath
* The application jar (and anything added via SparkContext.addJar)

Complicating the matter is that the 'deploy' backend currently supports the "spark.files.userClassPathFirst" option, but this is not supported by the Mesos backends (and I don't think it's supported by the YARN backend).

Ignoring the "userClassPathFirst" option, the current behaviour for the classpath is effectively:

1. The output of compute-classpath
2. The config option spark.executor.extraClassPath
3. The application jar (and anything added via SparkContext.addJar)

What should the preferred order be if userClassPathFirst is true? Currently the behaviour for the Deploy backend is effectively:

1. The application jar (and anything added via SparkContext.addJar)
2. The output of compute-classpath
3. The config option spark.executor.extraClassPath

To me it makes more sense for this to be in the order (application jar; spark.executor.extraClassPath; compute-classpath).

Agree? Disagree?

Thanks,
Graham
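
P.S. To make the three orderings concrete, here is a rough, language-neutral sketch in Python. It is not Spark code: the function name, the order labels ("default", "deploy_user_first", "proposed") and the example paths are all made up for illustration, and it assumes duplicate entries should keep their first position.

```python
# Illustrative sketch only; none of these names exist in Spark.
# Each order lists the three classpath sources from highest to lowest priority.
ORDERS = {
    # Current behaviour when userClassPathFirst is ignored.
    "default": ("compute_classpath", "extra_classpath", "app_jars"),
    # Current 'deploy' backend behaviour with userClassPathFirst=true.
    "deploy_user_first": ("app_jars", "compute_classpath", "extra_classpath"),
    # Order proposed in this email for userClassPathFirst=true.
    "proposed": ("app_jars", "extra_classpath", "compute_classpath"),
}


def executor_classpath(compute_classpath, extra_classpath, app_jars,
                       order="default", sep=":"):
    """Join the three classpath sources into one string in the given order."""
    if order not in ORDERS:
        raise ValueError("unknown order: %r" % (order,))
    sources = {
        "compute_classpath": compute_classpath,
        "extra_classpath": extra_classpath,
        "app_jars": app_jars,
    }
    entries = []
    seen = set()
    for name in ORDERS[order]:
        for entry in sources[name]:
            # Skip empty entries and keep only the first occurrence of each path.
            if entry and entry not in seen:
                seen.add(entry)
                entries.append(entry)
    return sep.join(entries)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to make the ordering visible.
    compute = ["/opt/spark/conf", "/opt/spark/lib/spark-assembly.jar"]
    extra = ["/opt/extra/serialiser-deps.jar"]
    app = ["/tmp/app.jar"]
    for order in ("default", "deploy_user_first", "proposed"):
        print(order, "->", executor_classpath(compute, extra, app, order=order))
```

The only difference between "deploy_user_first" and "proposed" is whether spark.executor.extraClassPath comes before or after the compute-classpath output.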