@Jerry Lam

> Can someone confirm if it is true that dynamic allocation on mesos "is
> designed to run one executor per slave with the configured amount of
> resources." I copied this sentence from the documentation. Does this mean
> there is at most one executor per node? Therefore, if you have a big
> machine, you need to allocate a fat executor on this machine in order to
> fully utilize it?


Yes, Spark on Mesos currently runs at most one executor per slave, so on a
big machine you need a single fat executor to fully utilize the node. This
is a limitation of Spark's Mesos scheduler rather than of Mesos itself, and
it is not related to dynamic allocation. There is, however, an outstanding
patch to add support for multiple executors per slave. Once that feature is
merged, it will work well with dynamic allocation.
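
To make the sizing concrete, here is a minimal sketch of a coarse-grained
setup as it stands today (the spark.* keys are real settings; the master
URL, app name, and resource numbers are just illustrative):

    import org.apache.spark.{SparkConf, SparkContext}

    // Coarse-grained Mesos mode launches at most one executor per slave,
    // sized by the settings below, so a big node needs one "fat" executor.
    val conf = new SparkConf()
      .setMaster("mesos://zk://zk1:2181,zk2:2181/mesos") // illustrative master URL
      .setAppName("coarse-grained-sizing")               // illustrative app name
      .set("spark.mesos.coarse", "true")    // coarse-grained execution
      .set("spark.executor.memory", "48g")  // illustrative: most of a 64 GB node
      .set("spark.cores.max", "160")        // illustrative cluster-wide core cap
    val sc = new SparkContext(conf)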


2015-11-23 9:27 GMT-08:00 Adam McElwee <a...@mcelwee.me>:

>
>
> On Mon, Nov 23, 2015 at 7:36 AM, Iulian Dragoș <iulian.dra...@typesafe.com
> > wrote:
>
>>
>>
>> On Sat, Nov 21, 2015 at 3:37 AM, Adam McElwee <a...@mcelwee.me> wrote:
>>
>>> I've used fine-grained mode on our mesos spark clusters until this week,
>>> mostly because it was the default. I started trying coarse-grained because
>>> of the recent chatter on the mailing list about wanting to move the mesos
>>> execution path to coarse-grained only. The odd thing is, coarse-grained vs.
>>> fine-grained yields drastically different cluster utilization metrics for
>>> any of our jobs that I've tried out this week.
>>>
>>> If this is best as a new thread, please let me know, and I'll try not to
>>> derail this conversation. Otherwise, details below:
>>>
>>
>> I think it's ok to discuss it here.
>>
>>
>>> We monitor our spark clusters with ganglia, and historically, we
>>> maintain at least 90% cpu utilization across the cluster. Making a single
>>> configuration change to use coarse-grained execution instead of
>>> fine-grained consistently yields a cpu utilization pattern that starts
>>> around 90% at the beginning of the job and then slowly decreases over the
>>> next 1-1.5 hours, leveling out around 65% cpu utilization on the cluster.
>>> Does anyone have a clue why I'd be seeing such a negative effect from
>>> switching to coarse-grained mode? GC activity is comparable in both cases.
>>> I've tried 1.5.2, as well as the 1.6.0 preview tag that's on github.
>>>
>>
>> I'm not very familiar with Ganglia or how it computes utilization, but one
>> thing comes to mind: did you enable dynamic allocation
>> <https://spark.apache.org/docs/latest/running-on-mesos.html#dynamic-resource-allocation-with-mesos>
>> in coarse-grained mode?
>>
>
> Dynamic allocation is definitely not enabled. The only delta between runs
> is adding --conf "spark.mesos.coarse=true" to the job submission. Ganglia
> is just pulling stats from procfs, and I've never seen it report bad
> results. If I sample any of the 100-200 nodes in the cluster, dstat
> reflects the same average cpu that I'm seeing reflected in ganglia.
>
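
(For reference, per the docs linked above, enabling dynamic allocation on
Mesos would look roughly like the sketch below; it requires coarse-grained
mode plus the Mesos external shuffle service started on each slave via
sbin/start-mesos-shuffle-service.sh. Exact keys may vary by Spark version.)

    // Sketch only: dynamic allocation on Mesos, assuming the external
    // shuffle service is already running on every slave.
    val conf = new SparkConf()
      .set("spark.mesos.coarse", "true")             // required for dynamic allocation on Mesos
      .set("spark.shuffle.service.enabled", "true")  // use the external shuffle service
      .set("spark.dynamicAllocation.enabled", "true")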
>>
>> iulian
>>
>
>
