The assumption is that the executor has no default properties set in its
environment through the Docker container.  Correct me if I'm wrong, but won't
any properties which are unset in the SparkContext come from the
environment of the executor?

Thanks,
- Alan

On Sat, Sep 19, 2015 at 1:09 AM, Tim Chen <t...@mesosphere.io> wrote:

> I guess I need a bit more clarification, what kind of assumptions was the
> dispatcher making?
>
> Tim
>
>
> On Thu, Sep 17, 2015 at 10:18 PM, Alan Braithwaite <a...@cloudflare.com>
> wrote:
>
>> Hi Tim,
>>
>> Thanks for the follow up.  It's not so much that I expect the executor to
>> inherit the configuration of the dispatcher as that I *don't* expect the
>> dispatcher to make assumptions about the system environment of the
>> executor (since it lives in a Docker container).  I could potentially see
>> a case where you might want to explicitly forbid the defaults, but I
>> can't think of any right now.
>>
>> Otherwise, I'm confused as to why the defaults in the Docker image for
>> the executor are just ignored.  I suppose it's the dispatcher's job to
>> ensure the *exact* configuration of the executor, regardless of the
>> defaults set on the executor's machine?  Is that the assumption being
>> made?  I can understand that in contexts which aren't Docker driven,
>> since jobs could be rolling out in the middle of a config update.  I'm
>> trying to think of this outside the terms of just Mesos/Docker (since I'm
>> fully aware that Docker doesn't rule the world yet).
>>
>> So I can see this from both perspectives now, and passing in the
>> properties file will probably work just fine for me.  But for my better
>> understanding: when the executor starts, will it read any of the
>> environment that it's executing in, or will it take only the properties
>> given to it by the dispatcher and nothing more?
>>
>> Let me know if anything needs more clarification, and thanks for your
>> Mesos contribution to Spark!
>>
>> - Alan
>>
>> On Thu, Sep 17, 2015 at 5:03 PM, Timothy Chen <t...@mesosphere.io> wrote:
>>
>>> Hi Alan,
>>>
>>> If I understand correctly, you are setting the executor home when you
>>> launch the dispatcher, not in the configuration when you submit the job,
>>> and you expect it to inherit that configuration?
>>>
>>> When I worked on the dispatcher, I assumed all configuration is passed
>>> to the dispatcher to launch the job exactly as you would launch it in
>>> client mode.
>>>
>>> But indeed it shouldn't crash the dispatcher; I'll take a closer look
>>> when I get a chance.
>>>
>>> Can you recommend changes on the documentation, either in email or a PR?
>>>
>>> Thanks!
>>>
>>> Tim
>>>
>>> Sent from my iPhone
>>>
>>> On Sep 17, 2015, at 12:29 PM, Alan Braithwaite <a...@cloudflare.com>
>>> wrote:
>>>
>>> Hey All,
>>>
>>> To bump this thread once again, I'm having some trouble using the
>>> dispatcher as well.
>>>
>>> I'm using the Mesos cluster manager with Docker executors.  I've
>>> deployed the dispatcher as a Marathon job.  When I submit a job using
>>> spark-submit, the dispatcher writes back that the submission was
>>> successful and then promptly dies in Marathon.  Looking at the logs
>>> reveals it was hitting the following line:
>>>
>>> 398: throw new SparkException("Executor Spark home `spark.mesos.executor.home` is not set!")
>>>
>>> Which is odd, because it's set in multiple places (SPARK_HOME,
>>> spark.mesos.executor.home, spark.home, etc.).  Reading the code, it
>>> appears that the driver description pulls only from the request and
>>> disregards any other properties that may be configured.  Testing by
>>> passing --conf spark.mesos.executor.home=/usr/local/spark on the command
>>> line to spark-submit confirms this.  We're trying to reduce the number
>>> of places where we have to set properties within Spark, and were hoping
>>> that it would be possible to have this pull in spark-defaults.conf from
>>> somewhere, or at least allow the user to inform the dispatcher through
>>> spark-submit that those properties will be available once the job starts.
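For concreteness, the workaround tested above looks roughly like this; only
the spark.mesos.executor.home setting comes from the discussion, while the
dispatcher address, job class, and jar URL are placeholders:

```shell
# Sketch of the --conf workaround described above. The dispatcher address,
# class name, and jar URL are placeholders for a real deployment; only
# spark.mesos.executor.home is taken from the thread.
spark-submit \
  --deploy-mode cluster \
  --master mesos://spark-dispatcher.example.com:7077 \
  --conf spark.mesos.executor.home=/usr/local/spark \
  --class com.example.MyJob \
  http://artifacts.example.com/my-job.jar
```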
>>>
>>> Finally, I don't think the dispatcher should crash in this event.  A
>>> misconfigured job submission hardly seems like an exceptional condition.
>>>
>>> Please direct me on the right path if I'm headed in the wrong
>>> direction.  Also let me know if I should open some tickets for these issues.
>>>
>>> Thanks,
>>> - Alan
>>>
>>> On Fri, Sep 11, 2015 at 1:05 PM, Tim Chen <t...@mesosphere.io> wrote:
>>>
>>>> Yes you can create an issue, or actually contribute a patch to update
>>>> it :)
>>>>
>>>> Sorry the docs are a bit light; I'm going to make them more complete
>>>> along the way.
>>>>
>>>> Tim
>>>>
>>>>
>>>> On Fri, Sep 11, 2015 at 11:11 AM, Tom Waterhouse (tomwater) <
>>>> tomwa...@cisco.com> wrote:
>>>>
>>>>> Tim,
>>>>>
>>>>> Thank you for the explanation.  You are correct, my Mesos experience
>>>>> is very light, and I haven’t deployed anything via Marathon yet.  What you
>>>>> have stated here makes sense, I will look into doing this.
>>>>>
>>>>> Adding this info to the docs would be great.  Is the appropriate
>>>>> action to create an issue regarding improvement of the docs?  For those of
>>>>> us who are gaining the experience having such a pointer is very helpful.
>>>>>
>>>>> Tom
>>>>>
>>>>> From: Tim Chen <t...@mesosphere.io>
>>>>> Date: Thursday, September 10, 2015 at 10:25 AM
>>>>> To: Tom Waterhouse <tomwa...@cisco.com>
>>>>> Cc: "user@spark.apache.org" <user@spark.apache.org>
>>>>> Subject: Re: Spark on Mesos with Jobs in Cluster Mode Documentation
>>>>>
>>>>> Hi Tom,
>>>>>
>>>>> Sorry the documentation isn't really rich, since it's probably
>>>>> assuming users understand how Mesos and frameworks work.
>>>>>
>>>>> First I need to explain the rationale for creating the dispatcher.  If
>>>>> you're not familiar with Mesos yet: each node in your datacenter runs
>>>>> a Mesos slave, which is responsible for publishing resources and
>>>>> running/watching tasks, and the Mesos master is responsible for taking
>>>>> the aggregated resources and scheduling them among frameworks.
>>>>>
>>>>> Frameworks are not managed by Mesos; the Mesos master/slave doesn't
>>>>> launch and maintain frameworks, but assumes they're launched and kept
>>>>> running on their own.  All the existing frameworks in the ecosystem
>>>>> therefore have their own ways to deploy, provide HA, and persist state
>>>>> (e.g.: Aurora, Marathon, etc.).
>>>>>
>>>>> Therefore, to introduce cluster mode with Mesos, we must create a
>>>>> long-running framework that runs in your datacenter and can handle
>>>>> launching Spark drivers on demand, HA, etc.  This is what the
>>>>> dispatcher is all about.
>>>>>
>>>>> So the idea is that you should launch the dispatcher not on the
>>>>> client, but on a machine in your datacenter. In Mesosphere's DCOS we 
>>>>> launch
>>>>> all frameworks and long running services with Marathon, and you can use
>>>>> Marathon to launch the Spark dispatcher.
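As an illustration of the Marathon approach, a dispatcher app could be posted
to Marathon's REST API along these lines. This is an untested sketch: the
Marathon address, ZooKeeper URL, Spark install path, and resource sizes are
all assumptions, not values from the thread:

```shell
# Hypothetical Marathon app definition for the dispatcher. The Marathon
# host, ZooKeeper master URL, and /opt/spark path are placeholders.
# start-mesos-dispatcher.sh daemonizes and returns, so the trailing sleep
# keeps the Marathon task itself alive.
curl -X POST http://marathon.example.com:8080/v2/apps \
  -H 'Content-Type: application/json' \
  -d '{
    "id": "/spark-dispatcher",
    "cmd": "/opt/spark/sbin/start-mesos-dispatcher.sh --master mesos://zk://zk1.example.com:2181/mesos && sleep infinity",
    "cpus": 0.5,
    "mem": 1024,
    "instances": 1
  }'
```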
>>>>>
>>>>> Then, instead of specifying the Mesos master URL (e.g.:
>>>>> mesos://mesos.master:2181), all clients just talk to the dispatcher
>>>>> (mesos://spark-dispatcher.mesos:7077), and the dispatcher will start
>>>>> and watch the driver for you.
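A client-side submission through the dispatcher would then look roughly like
the following; the dispatcher URL is the one from the paragraph above, while
the class name and jar location are placeholders:

```shell
# Cluster-mode submission sketch. com.example.MyJob and the jar URL are
# placeholders; with --deploy-mode cluster the dispatcher launches and
# supervises the driver inside the Mesos cluster rather than on the client.
spark-submit \
  --deploy-mode cluster \
  --master mesos://spark-dispatcher.mesos:7077 \
  --class com.example.MyJob \
  http://artifacts.example.com/my-job.jar
```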
>>>>>
>>>>> Tim
>>>>>
>>>>>
>>>>>
>>>>> On Thu, Sep 10, 2015 at 10:13 AM, Tom Waterhouse (tomwater) <
>>>>> tomwa...@cisco.com> wrote:
>>>>>
>>>>>> After spending most of yesterday scouring the Internet for sources of
>>>>>> documentation for submitting Spark jobs in cluster mode to a Spark 
>>>>>> cluster
>>>>>> managed by Mesos I was able to do just that, but I am not convinced that
>>>>>> how I have things setup is correct.
>>>>>>
>>>>>> I used the Mesos published
>>>>>> <https://open.mesosphere.com/getting-started/datacenter/install/>
>>>>>> instructions for setting up my Mesos cluster.  I have three Zookeeper
>>>>>> instances, three Mesos master instances, and three Mesos slave instances.
>>>>>> This is all running in Openstack.
>>>>>>
>>>>>> The documentation on the Spark documentation site states that “To
>>>>>> use cluster mode, you must start the MesosClusterDispatcher in your
>>>>>> cluster via the sbin/start-mesos-dispatcher.sh script, passing in the
>>>>>> Mesos master url (e.g: mesos://host:5050).”  That is it, no more
>>>>>> information than that.  So that is what I did: I have one machine
>>>>>> that I use as the Spark client for submitting jobs.  I started the
>>>>>> Mesos dispatcher with the script as described, and submitted the job
>>>>>> using the client machine’s IP address and port as the target.
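Spelled out, the start-up described here amounts to something like the
following; the Mesos master address is a placeholder for your own cluster's:

```shell
# Sketch of starting the dispatcher as quoted from the docs above;
# 10.1.2.3:5050 is a placeholder Mesos master address.
$SPARK_HOME/sbin/start-mesos-dispatcher.sh \
  --master mesos://10.1.2.3:5050
```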
>>>>>>
>>>>>> The job is currently running in Mesos as expected.  This is not,
>>>>>> however, how I would have expected to configure the system.  As
>>>>>> configured, there is one instance of the Spark Mesos dispatcher
>>>>>> running outside of Mesos, and so not a part of the sphere of Mesos
>>>>>> resource management.
>>>>>>
>>>>>> I used the following Stack Overflow posts as guidelines:
>>>>>> http://stackoverflow.com/questions/31164725/spark-mesos-dispatcher
>>>>>> http://stackoverflow.com/questions/31294515/start-spark-via-mesos
>>>>>>
>>>>>> There must be better documentation on how to deploy Spark in Mesos
>>>>>> with jobs able to be deployed in cluster mode.
>>>>>>
>>>>>> I can follow up with more specific information regarding my
>>>>>> deployment if necessary.
>>>>>>
>>>>>> Tom
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>