In my mind, fixing the immediate issue of requiring SPARK_HOME to be set when it's not actually used is a separate ticket from the larger cleanup of what SPARK_HOME means across the cluster.
I think you should file a new ticket for just this particular issue.

On Thu, May 22, 2014 at 11:03 AM, Gerard Maas <gerard.m...@gmail.com> wrote:

> Sure. Should I create a Jira as well?
>
> I saw there's already a broader ticket regarding the ambiguous use of
> SPARK_HOME [1] (cc: Patrick as owner of that ticket).
>
> I don't know if it would be more relevant to remove the use of SPARK_HOME
> when using Mesos and have the assembly as the only way forward, or whether
> that's too radical a change that might break some existing systems.
>
> From a real-world ops perspective, the assembly should be the way to go. I
> don't see installing and configuring Spark distros on a Mesos master as a
> way to have the Mesos executor in place.
>
> -kr, Gerard.
>
> [1] https://issues.apache.org/jira/browse/SPARK-1110
>
>
> On Thu, May 22, 2014 at 6:19 AM, Andrew Ash <and...@andrewash.com> wrote:
>
>> Hi Gerard,
>>
>> I agree that your second option seems preferred. You shouldn't have to
>> specify a SPARK_HOME if the executor is going to use the
>> spark.executor.uri instead. Can you send in a pull request that includes
>> your proposed changes?
>>
>> Andrew
>>
>>
>> On Wed, May 21, 2014 at 10:19 AM, Gerard Maas <gerard.m...@gmail.com>
>> wrote:
>>
>> > Spark devs,
>> >
>> > I was looking into a question asked on the user list where a
>> > ClassNotFoundException was thrown when running a job on Mesos. It's a
>> > curious issue with serialization on Mesos; more details here [1].
>> >
>> > When trying to run that simple example on my Mesos installation, I faced
>> > another issue: I got an error that "SPARK_HOME" was not set. I found
>> > that curious because a local Spark installation should not be required
>> > to run a job on Mesos. All that's needed is the executor package, i.e.
>> > the assembly .tar.gz at a reachable location (HDFS/S3/HTTP).
>> >
>> > I went looking into the code and indeed there's a check on SPARK_HOME
>> > [2] regardless of the presence of the assembly, but it's actually only
>> > used if the assembly is not provided (which is a kind of best-effort
>> > recovery strategy).
>> >
>> > Current flow:
>> >
>> >     if (!SPARK_HOME) fail("No SPARK_HOME")
>> >     else if (assembly) { use assembly }
>> >     else { try to use SPARK_HOME to build spark_executor }
>> >
>> > Should be:
>> >
>> >     sparkExecutor = if (assembly) { assembly }
>> >                     else if (SPARK_HOME) { try to use SPARK_HOME to
>> >                                            build spark_executor }
>> >                     else { fail("No executor found. Please provide
>> >                                  spark.executor.uri (preferred) or
>> >                                  spark.home") }
>> >
>> > What do you think?
>> >
>> > -kr, Gerard.
>> >
>> > [1]
>> > http://apache-spark-user-list.1001560.n3.nabble.com/ClassNotFoundException-with-Spark-Mesos-spark-shell-works-fine-td6165.html
>> >
>> > [2]
>> > https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MesosSchedulerBackend.scala#L89
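
For reference, here is a minimal, self-contained Scala sketch of the lookup order Gerard proposes above. The names (ExecutorSource, resolveExecutor, and the plain Map standing in for SparkConf) are illustrative only and are not the actual MesosSchedulerBackend code:

    object ExecutorUriResolution {

      sealed trait ExecutorSource
      case class AssemblyUri(uri: String) extends ExecutorSource
      case class LocalSparkHome(path: String) extends ExecutorSource

      // Proposed order: prefer the prepackaged assembly (spark.executor.uri),
      // fall back to SPARK_HOME, and fail only if neither is available.
      def resolveExecutor(conf: Map[String, String],
                          sparkHome: Option[String]): ExecutorSource =
        conf.get("spark.executor.uri") match {
          case Some(uri) => AssemblyUri(uri)
          case None =>
            sparkHome match {
              case Some(home) => LocalSparkHome(home)  // best-effort recovery path
              case None =>
                throw new IllegalStateException(
                  "No executor found. Please provide spark.executor.uri " +
                  "(preferred) or spark.home")
            }
        }

      def main(args: Array[String]): Unit = {
        // Assembly set: SPARK_HOME is irrelevant and no longer required.
        println(resolveExecutor(
          Map("spark.executor.uri" -> "hdfs:///spark/spark-assembly.tar.gz"),
          sparkHome = None))

        // No assembly, but SPARK_HOME present: fall back to the local install.
        println(resolveExecutor(Map.empty, sparkHome = Some("/opt/spark")))
      }
    }

The point of the ordering is that spark.executor.uri wins whenever it is set, and SPARK_HOME is consulted only as a best-effort fallback, so a job submitted with an assembly URI never needs a local Spark installation on the Mesos nodes.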