Re: Spark on Mesos / Executor Memory

Tom Arnfeld Sat, 11 Apr 2015 12:29:25 -0700

Thanks for sharing the details, Tim. I agree with James here, though:
capping cores doesn't really solve the underlying problem. In our case
we're running several frameworks, all of which consume varying amounts of
resources throughout their lifetime. When the cluster is busy, this results
in lots of tightly packed slaves, so when resources become available we
want frameworks to be able to do _something_, even if not at their desired
capacity. Ultimately this evens out over time to a fair share.


An example of where this becomes a problem with Spark: if we cap the cores
at 5 CPUs (in our case we'd see only a few executors per slave), we can set
the memory limit lower. However, this means that if the framework can only
get 1 CPU on a slave, it's going to require a lot more memory than it
really needs, and that may not be available, so nothing gets launched.
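
To make that concrete, here is a minimal sketch in Python. It is not Spark's
scheduler code: the Offer type, the function name and all of the numbers are
made up for illustration, and it assumes the executor asks for the same fixed
amount of memory no matter how many CPUs an offer contains.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """A simplified resource offer from one slave (hypothetical type)."""
    cpus: float
    mem_mb: int


# Hypothetical setting: executor memory sized for the 5-CPU cap.
EXECUTOR_MEM_MB = 10240


def can_launch_fixed(offer: Offer) -> bool:
    """With a fixed memory setting, the full amount is needed even for 1 CPU."""
    return offer.cpus >= 1 and offer.mem_mb >= EXECUTOR_MEM_MB


# A roomy offer fits, but a small offer on a busy slave is unusable,
# even though 1 CPU worth of tasks would need far less than 10 GB.
offers = [Offer(cpus=5, mem_mb=12288), Offer(cpus=1, mem_mb=3072)]
results = [can_launch_fixed(o) for o in offers]
```

Under these assumptions the second offer is declined outright, which is the
"nothing gets launched" case.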

> It also might be interesting to include a cores to memory multiplier so
> that with a larger amount of cores we try to scale the memory with some
> factor, but I'm not entirely sure that's intuitive to use and what people
> know what to set it to, as that can likely change with different workload.

A cores multiplier is definitely an interesting route to go down. I think
specifying the memory for the executor on its own and adding some memory
value multiplied by the number of CPUs allocated would go a long way
towards solving the problem. We're actually using coarse mode, but I think
the same sort of issue still stands for fine-grained mode; in fact it would
probably be worse, because the number of tasks per executor is a lot more
fluid.
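
As a rough sketch of the multiplier idea (Python, purely illustrative): the
base and per-CPU values below are hypothetical, no such settings exist in
Spark as far as this thread establishes, and the function names are invented
here. The memory request is a base amount plus a per-CPU amount times the
CPUs taken, and the scheduler takes as many CPUs from an offer as the
offered memory allows.

```python
def executor_memory_mb(base_mb: int, per_cpu_mb: int, cpus: int) -> int:
    """Memory to request: a fixed base plus a per-CPU increment."""
    return base_mb + per_cpu_mb * cpus


def cpus_to_accept(offer_cpus: float, offer_mem_mb: int,
                   base_mb: int, per_cpu_mb: int, max_cpus: int) -> int:
    """Largest whole number of CPUs whose scaled memory fits in the offer.

    Returns 0 when even a single CPU does not fit.
    """
    usable = min(int(offer_cpus), max_cpus)
    for cpus in range(usable, 0, -1):
        if executor_memory_mb(base_mb, per_cpu_mb, cpus) <= offer_mem_mb:
            return cpus
    return 0


# Hypothetical values: 1 GB base, 2 GB per CPU, cap of 5 CPUs.
small = cpus_to_accept(1, 3072, base_mb=1024, per_cpu_mb=2048, max_cpus=5)
partial = cpus_to_accept(5, 6000, base_mb=1024, per_cpu_mb=2048, max_cpus=5)
```

With those numbers a 1 CPU / 3 GB offer becomes usable, and a 5 CPU / 6000 MB
offer is accepted with 2 CPUs instead of being declined.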

Tom.

On 11 April 2015 at 21:05, CCAAT <[email protected]> wrote:

> Hello Tim,
>
> Your approach seems most reasonable, particularly from an overarching
> viewpoint. However, it occurs to me that as folks have several to many
> different frameworks (distributed applications) running on a given Mesos
> cluster, the optimization of resource allocation (utilization) may
> ultimately need to be under some sort of tunable, dynamic scheme. Most
> distributed applications, say one that runs for a few hours, will usually
> not have a constant demand for memory, so how can any static configuration
> suffice for a dynamic mix of frequently changing distributed applications?
> This is particularly amplified as a problem where Apache Spark is an
> "in-memory" resource demand, which is very different from other frameworks
> that may be active on the same cluster.
>
> I really think we are just experiencing the tip of the iceberg here
> as these mesos clusters grow, expand and take on a variety of problems,
> or did I miss some already existing robustness in the codes?
>
>
> James
>
>
>
> On 04/11/2015 12:29 PM, Tim Chen wrote:
>
>> (Adding spark user list)
>>
>> Hi Tom,
>>
>> If I understand correctly, you're saying that you're running into memory
>> problems because the scheduler is allocating too many CPUs and not
>> enough memory to accommodate them, right?
>>
>> In the case of fine-grained mode I don't think that's a problem, since we
>> have a fixed amount of CPU and memory per task.
>> However, in coarse-grained mode you can run into that problem if you're
>> within the spark.cores.max limit and memory is a fixed number.
>>
>> I have a patch out to configure the maximum number of CPUs a
>> coarse-grained executor should use, and it also allows multiple executors
>> in coarse-grained mode. So you could, say, try to launch multiple
>> executors of at most 4 cores each with spark.executor.memory (+ overhead
>> etc.) on a slave.
>> (https://github.com/apache/spark/pull/4027)
>>
>> It also might be interesting to include a cores to memory multiplier so
>> that with a larger amount of cores we try to scale the memory with some
>> factor, but I'm not entirely sure that's intuitive to use and what
>> people know what to set it to, as that can likely change with different
>> workload.
>>
>> Tim
>>
>>
>>
>>
>>
>>
>>
>> On Sat, Apr 11, 2015 at 9:51 AM, Tom Arnfeld <[email protected]
>> <mailto:[email protected]>> wrote:
>>
>>     We're running Spark 1.3.0 (with a couple of patches over the top for
>>     docker related bits).
>>
>>     I don't think SPARK-4158 is related to what we're seeing, things do
>>     run fine on the cluster, given a ridiculously large executor memory
>>     configuration. As for SPARK-3535, although that looks useful, I think
>>     we're seeing something else.
>>
>>     Put a different way, the amount of memory required at any given time
>>     by the spark JVM process is directly proportional to the amount of
>>     CPU it has, because more CPU means more tasks and more tasks means
>>     more memory. Even if we're using coarse mode, the amount of executor
>>     memory should be proportionate to the amount of CPUs in the offer.
>>
>>     On 11 April 2015 at 17:39, Brenden Matthews <[email protected]
>>     <mailto:[email protected]>> wrote:
>>
>>         I ran into some issues with it a while ago, and submitted a
>>         couple PRs to fix it:
>>
>>         https://github.com/apache/spark/pull/2401
>>         https://github.com/apache/spark/pull/3024
>>
>>         Do these look relevant? What version of Spark are you running?
>>
>>         On Sat, Apr 11, 2015 at 9:33 AM, Tom Arnfeld <[email protected]
>>         <mailto:[email protected]>> wrote:
>>
>>             Hey,
>>
>>             Not sure whether it's best to ask this on the spark mailing
>>             list or the mesos one, so I'll try here first :-)
>>
>>             I'm having a bit of trouble with out of memory errors in my
>>             spark jobs... it seems fairly odd to me that memory
>>             resources can only be set at the executor level, and not
>>             also at the task level. For example, as far as I can tell
>>             there's only a *spark.executor.memory* config option.
>>
>>             Surely the memory requirements of a single executor are
>>             quite dramatically influenced by the number of concurrent
>>             tasks running? Given a shared cluster, I have no idea what %
>>             of an individual slave my executor is going to get, so I
>>             basically have to set the executor memory to a value that's
>>             correct when the whole machine is in use...
>>
>>             Has anyone else running Spark on Mesos come across this, or
>>             maybe someone could correct my understanding of the config
>>             options?
>>
>>             Thanks!
>>
>>             Tom.
>>
>>
>>
>>
>>
>
