Bill Farner created AURORA-255:
----------------------------------
Summary: Tasks in a job should schedule faster
Key: AURORA-255
URL: https://issues.apache.org/jira/browse/AURORA-255
Project: Aurora
Issue Type: Story
Components: Scheduler
Reporter: Bill Farner
Assignee: Bill Farner
The scheduler has two mechanisms that rate-limit consumption from the queue of
pending tasks:
1.) A global rate limit for task scheduling _tuned with
max_schedule_attempts_per_sec, default 10_.
2.) A truncated binary exponential backoff that penalizes a task group (a
collection of equivalent tasks, usually all of the pending tasks in a job) when
the scheduler fails to find an available slot for a task. _tuned with
initial_schedule_delay and max_schedule_delay, defaults 1 second and 30
seconds, respectively_.
Combined, these features attempt to ensure the scheduler remains responsive for
other essential duties, like handling task status updates and responding to
RPCs.
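As a rough illustration of (2), the penalty sequence for a task group that
keeps failing to find a slot can be sketched as below. This is a minimal
sketch, not the scheduler's actual code: the function name and the assumption
that the delay doubles after every failure until it reaches the cap are
illustrative only. The default arguments mirror the defaults of
initial_schedule_delay and max_schedule_delay quoted above.

```python
def backoff_delays(initial_secs=1.0, max_secs=30.0, failures=7):
    """Return the penalty (in seconds) applied after each consecutive failure.

    Hypothetical helper: assumes the delay starts at initial_secs, doubles
    after every failed attempt, and is truncated at max_secs.
    """
    delays = []
    delay = initial_secs
    for _ in range(failures):
        delays.append(delay)
        # Double the penalty, but never exceed the configured maximum.
        delay = min(delay * 2, max_secs)
    return delays


print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

Under these assumptions the penalty reaches the 30 second cap after five
consecutive failures and stays there.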
The default values of these settings are too conservative. (1) can be
safely raised to 20, which should still reserve sufficient CPU for activities
other than PENDING task scheduling. (2) should be altered to impose no
initial penalty on a task, allowing a task that is scheduling without failure
to schedule as fast as (1) permits.
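A back-of-the-envelope estimate shows why this matters for a large job. The
model below is an assumption, not a description of the scheduler's internals:
it supposes a single job is pending, every attempt succeeds, and the initial
penalty is paid once per task in the group, so throughput is bounded by
whichever of the two limits is slower. The function name is hypothetical.

```python
def seconds_to_schedule(num_tasks, attempts_per_sec, initial_penalty_secs):
    """Estimate the time to schedule num_tasks when no attempt fails.

    Simplified model (assumption): the job is bound by the slower of the
    global rate limit and the per-task initial penalty.
    """
    rate_limited = num_tasks / attempts_per_sec
    penalty_limited = num_tasks * initial_penalty_secs
    return max(rate_limited, penalty_limited)


# Current defaults: 10 attempts/sec, 1 second initial penalty.
print(seconds_to_schedule(100, 10, 1.0))  # 100.0
# Proposed: 20 attempts/sec, no initial penalty.
print(seconds_to_schedule(100, 20, 0.0))  # 5.0
```

In this simplified model, a 100-task job that schedules without failure would
take roughly 100 seconds with the current defaults and roughly 5 seconds with
the proposed values.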
--
This message was sent by Atlassian JIRA
(v6.2#6252)