Bill Farner created AURORA-255:
----------------------------------

             Summary: Tasks in a job should schedule faster
                 Key: AURORA-255
                 URL: https://issues.apache.org/jira/browse/AURORA-255
             Project: Aurora
          Issue Type: Story
          Components: Scheduler
            Reporter: Bill Farner
            Assignee: Bill Farner


The scheduler has two mechanisms that rate-limit consumption from the queue of 
pending tasks.

1.) A global rate limit on task scheduling attempts. _Tuned with 
max_schedule_attempts_per_sec, default 10._

2.) A truncated binary backoff that penalizes a task group when the scheduler 
fails to find an available slot for a task. A task group is a collection of 
equivalent tasks, usually all of the pending tasks in a job. _Tuned with 
initial_schedule_delay and max_schedule_delay, defaults 1 second and 30 
seconds, respectively._
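
For illustration, the truncated binary backoff in (2) can be sketched as 
follows. This is a minimal sketch, not the scheduler's actual code: the 
function name and its parameters are hypothetical, and it assumes the delay 
starts at initial_schedule_delay, doubles after each consecutive failure, and 
is capped at max_schedule_delay.

```python
def backoff_delays(initial_secs=1.0, max_secs=30.0, failures=7):
    """Return the penalty (in seconds) applied after each consecutive failure.

    Hypothetical helper: assumes the delay doubles on every failure and is
    truncated at max_secs.
    """
    delays = []
    delay = initial_secs
    for _ in range(failures):
        delays.append(delay)
        # Binary backoff: double the delay, but never exceed the cap.
        delay = min(delay * 2, max_secs)
    return delays


print(backoff_delays())  # [1.0, 2.0, 4.0, 8.0, 16.0, 30.0, 30.0]
```

With the default values, a task group that keeps failing reaches the 30 second 
cap on its sixth consecutive failure.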

Combined, these features attempt to ensure the scheduler remains responsive for 
other essential duties, like handling task status updates and responding to 
RPCs.

The default configuration of these settings is too conservative.  (1) can be 
safely raised to 20, which should still reserve sufficient CPU for activities 
other than PENDING task scheduling.  (2) should be altered to impose no 
initial penalty on a task, so that a task that is scheduling without failure 
can schedule as fast as (1) permits.
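
A rough sketch of the proposed behavior, under stated assumptions: the class 
and function names below are hypothetical, the penalty is assumed to reset to 
zero after any successful attempt, and the lower bound assumes the global rate 
limit is the only constraint and every attempt succeeds.

```python
class GroupPenalty:
    """Hypothetical per-task-group penalty with no initial delay."""

    def __init__(self, initial_secs=1.0, max_secs=30.0):
        self.initial_secs = initial_secs
        self.max_secs = max_secs
        self.delay = 0.0  # proposed: no penalty until a failure occurs

    def on_success(self):
        # A group that schedules without failure is never delayed.
        self.delay = 0.0
        return self.delay

    def on_failure(self):
        # First failure starts at the initial delay; later ones double it,
        # truncated at the maximum.
        if self.delay == 0.0:
            self.delay = self.initial_secs
        else:
            self.delay = min(self.delay * 2, self.max_secs)
        return self.delay


def min_seconds_to_schedule(task_count, attempts_per_sec):
    """Lower bound on scheduling time imposed by the global rate limit."""
    return task_count / attempts_per_sec


print(min_seconds_to_schedule(1000, 10))  # 100.0
print(min_seconds_to_schedule(1000, 20))  # 50.0
```

Under these assumptions, raising the global limit from 10 to 20 halves the 
minimum time to schedule a 1000-task job, from 100 seconds to 50 seconds.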



--
This message was sent by Atlassian JIRA
(v6.2#6252)
