For some resources (like disk, or more acutely, RAM), there's not much we
can do to provide assurances.  Ultimately, resource-driven task termination
is managed at the node level and may reflect a real exhaustion of the
resource.  I'd worry that trying to augment this would trade one problem
for another: the rationale for killing a task could become
non-deterministic, or even error-prone.
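To make the concern concrete, here is a minimal sketch of the kind of kill-rate throttle Josh describes, written as standalone Python rather than any real scheduler API (the class and method names are purely illustrative). It caps kills to a fixed number per time window; note how a task over its allocation can then survive indefinitely depending on which tasks happened to be killed first, which is exactly the non-determinism worry above.

```python
import time

class KillThrottle:
    """Hypothetical sketch: rate-limit task kills so a job-wide
    resource spike doesn't wipe out every task at once.
    Not a real scheduler API; names are illustrative only."""

    def __init__(self, max_kills, per_seconds):
        self.max_kills = max_kills      # kills allowed per window
        self.per_seconds = per_seconds  # window length in seconds
        self.kill_times = []            # timestamps of recent kills

    def may_kill(self, now=None):
        """Return True if a kill is permitted right now; False to defer it."""
        now = time.monotonic() if now is None else now
        # Drop kill records that have aged out of the sliding window.
        self.kill_times = [t for t in self.kill_times
                           if now - t < self.per_seconds]
        if len(self.kill_times) < self.max_kills:
            self.kill_times.append(now)
            return True
        return False  # deferred: the task keeps running over its allocation

# Allow at most 2 kills per 60-second window.
throttle = KillThrottle(max_kills=2, per_seconds=60)
```

The deferred tasks buy triage time, but which tasks get deferred depends only on kill ordering, not on how badly each one is over its allocation.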

On Wed, Oct 28, 2015 at 3:45 PM, Josh Adams <[email protected]> wrote:

> Good afternoon all,
>
> Is it possible to tell the scheduler to throttle kill rates for a given
> job? When all tasks in a job start consuming too much disk or RAM because
> of an unexpected service dependency meltdown it would be nice if we had a
> little buffer time to triage the issue without the scheduler killing them
> all en masse for using more than their allocated resources simultaneously...
>
> Cheers,
> Josh
>