So, to confirm - in this mode, when a Spark application/context runs a
series of tasks, each task will launch a full Spark Executor process?
What is the cpu/mem cost of such a Spark Executor process (the resource
sizing passed in the Mesos task launch request)?
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Each Spark task is launched on a running MesosExecutorBackend process,
with the task data serialized and passed to that executor. So it is not
a full JVM process per task; you can look at MesosExecutorBackend to see
how it's launched.
Each Spark task is sized at spark.task.cpus (default 1) cpus, and memory
is calculated by
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/mesos/MemoryUtils.scala.
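
For reference, here's a minimal sketch of what that memory sizing amounts to (the exact constants and config key may differ from current master, so treat this as an approximation rather than the real MemoryUtils code): the Mesos request asks for the executor memory plus an overhead of max(10% of executor memory, 384 MB), overridable via spark.mesos.executor.memoryOverhead.

object MemoryUtilsSketch {
  // Assumed constants: 10% overhead with a 384 MB floor (approximation, not the actual source).
  val OverheadFraction = 0.10
  val OverheadMinimumMb = 384

  // Total memory (MB) requested from Mesos for one Spark executor.
  def calculateTotalMemoryMb(executorMemoryMb: Int, overheadOverrideMb: Option[Int] = None): Int = {
    val overheadMb = overheadOverrideMb.getOrElse(
      math.max(OverheadFraction * executorMemoryMb, OverheadMinimumMb).toInt)
    executorMemoryMb + overheadMb
  }
}

// e.g. a 4096 MB executor => 4096 + max(409, 384) = 4505 MB in the Mesos resource request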

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
I see. Is per-task resource management somehow performed in this
mode? In other words, if the size of a CoarseGrainedExecutorBackend
process is M megabytes of memory and C cpu cores, how many Spark tasks
can be sent to it for execution at the same time (before they start to
queue up at the driver process)? Is each Spark task sized in terms
of cpu/mem?
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

CoarseGrainedSchedulerBackend is responsible for choosing how many
tasks it will send to each running ExecutorBackend; you can
follow the logic at
https://github.com/apache/spark/blob/master/core/src/main/scala/org/apache/spark/scheduler/cluster/CoarseGrainedSchedulerBackend.scala#L167
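
Roughly, and as a simplified sketch rather than the actual scheduler code (the names below are illustrative, not Spark's): each registered executor backend advertises its free cores, and the driver keeps handing it Spark tasks while freeCores >= spark.task.cpus; anything beyond that stays queued on the driver until cores free up.

case class WorkerOffer(executorId: String, host: String, freeCores: Int)

def assignTasks(offers: Seq[WorkerOffer],
                pendingTaskIds: List[Long],
                cpusPerTask: Int): Map[String, Seq[Long]] = {
  var remaining = pendingTaskIds
  val assignments = scala.collection.mutable.Map[String, Seq[Long]]().withDefaultValue(Seq.empty)
  for (offer <- offers) {
    var free = offer.freeCores
    // Keep assigning tasks to this executor while it still has spark.task.cpus free.
    while (free >= cpusPerTask && remaining.nonEmpty) {
      assignments(offer.executorId) = assignments(offer.executorId) :+ remaining.head
      remaining = remaining.tail
      free -= cpusPerTask
    }
  }
  assignments.toMap
}

// So with C cores per executor, at most C / spark.task.cpus Spark tasks run on it concurrently.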


------------------------------------------------------------------------------------------------------------------------------------------------------------------------
What is the current target for a release (or patch landing) of the
dynamic allocation in Spark/Mesos?
------------------------------------------------------------------------------------------------------------------------------------------------------------------------

I'm hoping dynamic allocation for coarse-grained mode lands in 1.4,
but we'll see how it goes! (https://github.com/apache/spark/pull/4984)
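
In the meantime, the generic dynamic allocation switches already exist on the Spark side; presumably (my assumption, not something that patch guarantees) a coarse-grained Mesos setup would end up configured along these lines, external shuffle service included:

import org.apache.spark.SparkConf

// Hypothetical configuration sketch; the Mesos-specific requirements may change
// once the dynamic allocation patch actually lands.
val conf = new SparkConf()
  .setMaster("mesos://zk://zk-host:2181/mesos")      // illustrative master URL
  .set("spark.dynamicAllocation.enabled", "true")
  .set("spark.shuffle.service.enabled", "true")      // external shuffle service required
  .set("spark.dynamicAllocation.minExecutors", "1")
  .set("spark.dynamicAllocation.maxExecutors", "20")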

On Wed, May 6, 2015 at 4:53 AM, Gidon Gershinsky <gi...@il.ibm.com> wrote:
> Thanks Tim, a few  follow-up questions using the Mesos|Spark prefixing -
>
>
>
>> 2. In fine-grained mode, what happens is that the Spark scheduler
>> specifies a custom Mesos executor per slave, and each Mesos task is a
>> Spark executor that will be launched by that Mesos executor. It's hard
>> to determine what exactly you're asking, since tasks and executors are
>> both terms used in Spark and Mesos; perhaps prefixing with (Mesos|Spark)
>> will clarify what you're asking about.
>>
>> I'm not sure what you mean by 'slice of app executor', but in fine-grained
>> mode there is a fixed resource cost to launch a per-slave executor,
>> and then a cpu/mem cost to launch each Mesos task that launches a Spark
>> executor. Each framework is given offers by the Mesos master and each has
>> the opportunity to use an offer or not.
>
> So, to confirm - in this mode, when a Spark application/context runs a
> series of tasks, each task will launch a full Spark Executor process?
> What is the cpu/mem cost of such a Spark Executor process (the resource sizing
> passed in the Mesos task launch request)?
>
>
>>
>> 3. In coarse-grained mode the scheduler launches a
>> CoarseGrainedExecutorBackend on each slave, which registers
>> back to the CoarseGrainedSchedulerBackend via the akka driverUrl. Then
>> the CoarseGrainedSchedulerBackend can schedule individual Spark tasks
>> to those long-running executor backends. These mini-tasks are, I believe,
>> the same as Spark tasks, but instead of running a Mesos task per
>> Spark task, it distributes these tasks to those long-running Spark
>> executors.
>
> I see. Is a per-task resource management somehow performed in this mode? In
> other words, if the size of CoarseGrainedExecutorBackend
> process is M megabytes of memory and C cpu cores, how many Spark tasks can
> be sent to it for execution at the same time (before starting to queue them
> at the driver process)? Is each Spark task sized in terms of cpu/mem?
>
>>
>> Mesos resources become more static in coarse-grained mode, as it will
>> just launch a number of these CoarseGrainedExecutorBackends and keep
>> them running until the driver stops. Note this is subject to change
>> with dynamic allocation and other Spark/Mesos patches going into
>> Spark.
>
>
> What is the current target for a release (or patch landing) of the dynamic
> allocation in Spark/Mesos?
>
>
>>
>> Tim
>>
>> On Tue, May 5, 2015 at 6:19 AM, Gidon Gershinsky <gi...@il.ibm.com> wrote:
>> > Hi all,
>> >
>> > I have a few questions on how Spark is integrated with Mesos - any
>> > details, or pointers to a design document / relevant source, will be
>> > much
>> > appreciated.
>> >
>> > I'm aware of this description,
>> > https://github.com/apache/spark/blob/master/docs/running-on-mesos.md
>
>> >
>> > But it's pretty high-level as far as the design is concerned, while I'm
>> > looking for lower-level details on how Spark actually calls the Mesos APIs,
>> > how it launches the tasks, etc.
>> >
>> > Namely,
>> > 1. Does Spark create a Mesos Framework instance for each Spark
>> > application (SparkContext)?
>> >
>> > 2. Citing from the link above,
>> >
>> > "In "fine-grained" mode (default), each Spark task runs as a separate
>> > Mesos task ... comes with an additional overhead in launching each task
>> > "
>> >
>> >
>> > Does it mean that the Mesos slave launches a Spark Executor for each
>> > task? (unlikely...) Or does the slave host have a number of Spark Executors
>> > pre-launched (one per application), and send the task to its
>> > application's executor?
>> > What is the resource offer then? Is it a host's cpu slice offered to any
>> > Framework (Spark app/context), which then sends a task to run on it? Or is it
>> > a 'slice of an app Executor' that went idle and is offered to its Framework?
>> >
>> > 3. "The "coarse-grained" mode will instead launch only one long-running
>> > Spark task on each Mesos machine, and dynamically schedule its own
>> > "mini-tasks" within it. "
>> >
>> > What is this special task? Is it the Spark app Executor? How are these
>> > mini-tasks different from 'regular' Spark tasks? How are the resources
>> > allocated/offered in this mode?
>> >
>> >
>> >
>> > Regards,
>> > Gidon
>> >
>>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscr...@spark.apache.org
For additional commands, e-mail: dev-h...@spark.apache.org
