Changing the UI to account for this might be a good idea now that we
support other executors and multiple executors. Unfortunately, it's not
going to be easy because we don't even persist overhead per task. Would you
mind filing a ticket to track this enhancement?
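
To put rough numbers on the accounting gap Rick describes below, here is a minimal sketch. It assumes the UI sums only task CPU while Mesos also reserves the per-task executor overhead. The function name and the 1000-task cluster size are hypothetical and are not Aurora or Mesos APIs.

```python
# Hypothetical illustration only; not Aurora or Mesos code.

def reservation_gap(task_count, executor_cpu_overhead):
    """Cores reserved for executors that a UI summing only task CPU would not show."""
    return task_count * executor_cpu_overhead


# Assumed cluster size of 1000 small tasks (made up for illustration).
gap_default = reservation_gap(1000, 0.25)  # the default overhead mentioned in this thread
gap_reduced = reservation_gap(1000, 0.1)   # the lower value Wes asks about

print(f"gap at 0.25 overhead: {gap_default:.1f} cores")
print(f"gap at 0.10 overhead: {gap_reduced:.1f} cores")
```

With many small tasks the hidden reservation grows linearly with task count, which is why it shows up as a large difference between the two views.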

Note that pushing the overhead too low makes it possible to create tasks
with so little CPU that the executor cannot start, which means those tasks
will fail to launch. I derived the initial value empirically by observing
overhead in a large cluster and rounding up. I think the executor needs at
least 0.1 cores to start a task and consumes ~0.05 cores afterwards to
conduct health checks and monitor process state.
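
As a rough sketch of that reasoning: the check below assumes the executor shares one CPU allocation with the task, so the total is task CPU plus the configured overhead. The 0.1 and 0.05 figures are my estimates from above, and the function names are hypothetical rather than anything in Aurora.

```python
# Hypothetical illustration only; thresholds are the rough estimates from this email.
EXECUTOR_STARTUP_CPU = 0.1   # estimated cores the executor needs to start a task
EXECUTOR_STEADY_CPU = 0.05   # estimated cores used afterwards for health checks and monitoring


def executor_can_start(task_cpu, executor_overhead):
    """True if the combined allocation covers the estimated startup need."""
    return task_cpu + executor_overhead >= EXECUTOR_STARTUP_CPU


def cpu_left_for_task(task_cpu, executor_overhead):
    """Cores remaining for the task's own processes once the executor is in steady state."""
    return max(0.0, task_cpu + executor_overhead - EXECUTOR_STEADY_CPU)


# A tiny task with no overhead configured falls below the startup estimate.
print(executor_can_start(0.01, 0.0))
# The same task with the default 0.25 overhead clears it.
print(executor_can_start(0.01, 0.25))
```

With the overhead set to 0, task owners would need to request at least the startup estimate themselves for the first check to pass.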

Also note that if you have health checks and too little CPU allocated to
your tasks, you might deny CPU to the process that is being health checked,
causing it to fail randomly.

On Wed, Sep 7, 2016 at 3:18 PM, Stephan Erb <[email protected]> wrote:

> Personally, I would not mind if we dropped the executor overhead completely
> and asked the users to add it on their own. We would probably have to
> enforce a minimal task size to prevent Thermos OOMs, but that should not be
> a big problem.
>
>
>
> On Wed, 2016-09-07 at 16:41 -0400, Rick Mangi wrote:
>
> One of the problems we saw from this was that Aurora doesn’t seem to include
> the Thermos overhead when computing allocated resources, so we were seeing
> a huge gap between what Aurora said we were reserving and what Mesos said
> was available. Perhaps the Aurora UI should take the Thermos executor
> overhead into account when computing used resources.
>
>
> On Sep 7, 2016, at 4:17 PM, Joshua Cohen <[email protected]> wrote:
>
> We run internally with -thermos_executor_cpu set to 0 (requiring task
> owners to account for any executor CPU usage). This is generally safe, but
> task owners should be notified that there's an outside chance they might
> see CPU throttling that they were not previously seeing (assuming you're
> using cgroup/cpu isolation, that is).
>
> On Wed, Sep 7, 2016 at 2:44 PM, Wesley Chow <[email protected]> wrote:
>
> It’s currently set to a default of 0.25, which seems excessive to us since
> we tend to run a larger number of small tasks. Is bringing that down to 0.1
> a terrible thing to do?
>
> Thanks,
> Wes
>
