In my mind, fixing the immediate issue of requiring SPARK_HOME to be set when it's not actually used is a separate ticket from the larger cleanup of what SPARK_HOME means across the cluster.
I think you should file a new ticket for just this particular issue.

On Thu, May 22, 2014 at 11:03 AM, Gerard Maas <gerard.m...@gmail.com> wrote:

> Sure. Should I create a Jira as well?
>
> I saw there's already a broader ticket regarding the ambiguous use of
> SPARK_HOME [1] (cc: Patrick as owner of that ticket).
>
> I don't know if it would be more relevant to remove the use of SPARK_HOME
> when using Mesos and have the assembly as the only way forward, or whether
> that's too radical a change that might break some existing systems.
>
> From a real-world ops perspective, the assembly should be the way to go. I
> don't see installing and configuring Spark distros on a Mesos master as a
> way to have the Mesos executor in place.
>
> -kr, Gerard.
>
> [1] https://issues.apache.org/jira/browse/SPARK-1110
>
>
> On Thu, May 22, 2014 at 6:19 AM, Andrew Ash <and...@andrewash.com> wrote:
>
>> Hi Gerard,
>>
>> I agree that your second option seems preferred. You shouldn't have to
>> specify a SPARK_HOME if the executor is going to use the
>> spark.executor.uri instead. Can you send in a pull request that includes
>> your proposed changes?
>>
>> Andrew
>>
>>
>> On Wed, May 21, 2014 at 10:19 AM, Gerard Maas <gerard.m...@gmail.com>
>> wrote:
>>
>> > Spark devs,
>> >
>> > I was looking into a question asked on the user list where a
>> > ClassNotFoundException was thrown when running a job on Mesos. It's a
>> > curious issue with serialization on Mesos; more details here [1].
>> >
>> > When trying to run that simple example on my Mesos installation, I faced
>> > another issue: I got an error that "SPARK_HOME" was not set. I found
>> > that curious because a local Spark installation should not be required
>> > to run a job on Mesos. All that's needed is the executor package, i.e.
>> > the assembly .tar.gz at a reachable location (HDFS/S3/HTTP).
>> >
>> > I went looking into the code and indeed there's a check on SPARK_HOME
>> > [2] regardless of the presence of the assembly, but it's actually only
>> > used if the assembly is not provided (which is a kind of best-effort
>> > recovery strategy).
>> >
>> > Current flow:
>> >
>> >     if (!SPARK_HOME) fail("No SPARK_HOME")
>> >     else if (assembly) { use assembly }
>> >     else { try to use SPARK_HOME to build spark_executor }
>> >
>> > Should be:
>> >
>> >     sparkExecutor = if (assembly) { assembly }
>> >                     else if (SPARK_HOME) { try to use SPARK_HOME to
>> >                                            build spark_executor }
>> >                     else { fail("No executor found. Please provide
>> >                                  spark.executor.uri (preferred) or
>> >                                  spark.home") }
>> >
>> > What do you think?
>> >
>> > -kr, Gerard.
>> >
>> > [1]
>> > http://apache-spark-user-list.1001560.n3.nabble.com/ClassNotFoundException-with-Spark-Mesos-spark-shell-works-fine-td6165.html
>> >
>> > [2]
>> > https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala#L89
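
For reference, here is a minimal, self-contained Scala sketch of the lookup order Gerard proposes above. The names (ExecutorSource, resolveExecutor, and the plain Map standing in for SparkConf) are illustrative only and are not the actual MesosSchedulerBackend code:

    object ExecutorUriResolution {

      sealed trait ExecutorSource
      case class AssemblyUri(uri: String) extends ExecutorSource
      case class LocalSparkHome(path: String) extends ExecutorSource

      // Proposed order: prefer the prepackaged assembly (spark.executor.uri),
      // fall back to SPARK_HOME, and fail only if neither is available.
      def resolveExecutor(conf: Map[String, String],
                          sparkHome: Option[String]): ExecutorSource =
        conf.get("spark.executor.uri") match {
          case Some(uri) => AssemblyUri(uri)
          case None =>
            sparkHome match {
              case Some(home) => LocalSparkHome(home)  // best-effort recovery path
              case None =>
                throw new IllegalStateException(
                  "No executor found. Please provide spark.executor.uri " +
                  "(preferred) or spark.home")
            }
        }

      def main(args: Array[String]): Unit = {
        // Assembly set: SPARK_HOME is irrelevant and no longer required.
        println(resolveExecutor(
          Map("spark.executor.uri" -> "hdfs:///spark/spark-assembly.tar.gz"),
          sparkHome = None))

        // No assembly, but SPARK_HOME present: fall back to the local install.
        println(resolveExecutor(Map.empty, sparkHome = Some("/opt/spark")))
      }
    }

The point of the ordering is that spark.executor.uri wins whenever it is set, and SPARK_HOME is consulted only as a best-effort fallback, so a job submitted with an assembly URI never needs a local Spark installation on the Mesos nodes.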