[jira] [Updated] (AIRFLOW-72) Implement proper capacity scheduler

Ry Walker (JIRA) Tue, 16 Apr 2019 10:39:23 -0700


     [ 
https://issues.apache.org/jira/browse/AIRFLOW-72?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Ry Walker updated AIRFLOW-72:
-----------------------------
    Affects Version/s:     (was: 1.7.1)

> Implement proper capacity scheduler
> -----------------------------------
>
>                 Key: AIRFLOW-72
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-72
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: pools, scheduler
>            Reporter: Bolke de Bruin
>            Priority: Major
>              Labels: pool, queue, scheduler
>             Fix For: 2.0.0
>
>
> The scheduler is supposed to maintain queues and pools according to a 
> "capacity" model. However, this is currently not properly implemented. As 
> a result, pools can be oversubscribed, race conditions exist when 
> queuing/dequeuing, and there are probably other issues.
> This Jira Epic tracks all issues related to pooling/queuing and the 
> (tbd) roadmap to a proper capacity scheduler.
> Why queuing / scheduling is broken:
> Locking is not properly implemented, and cannot be, because the check for 
> slot availability is spread throughout the scheduler, taskinstance and 
> executor. This makes obtaining a slot non-atomic and results in 
> oversubscription. In addition, it leads to race conditions, such as two 
> tasks being picked from the queue at the same time, because the scheduler 
> determines that a queued task still needs to be sent to the executor 
> although an earlier run already sent it.
> To fix this, pool handling needs to be centralized (code-wise) and 
> work with a mutex (with_for_update()) on the database records. The 
> scheduler/taskinstance can then do something like:
> slot = Pool.obtain_slot(pool_id)
> Pool.release_slot(slot)
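
The proposal above can be sketched as follows. This is a minimal in-memory illustration, not Airflow's implementation: the `Pool` class, its fields, and the `run_workers` helper are hypothetical, the methods are instance methods rather than the class-level calls shown in the issue, and a `threading.Lock` stands in for the database row lock that SQLAlchemy's `with_for_update()` would take. The point it tries to show is that the availability check and the claim happen inside one critical section, so concurrent callers cannot oversubscribe the pool.

```python
import threading
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Pool:
    """Hypothetical centralized pool; all names here are illustrative."""
    pool_id: str
    slots: int
    used: int = 0
    # Stand-in for the row lock a SELECT ... FOR UPDATE would hold.
    _lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def obtain_slot(self) -> Optional[int]:
        """Check availability and claim a slot in one atomic step."""
        with self._lock:
            if self.used >= self.slots:
                return None  # pool is full; the caller leaves the task queued
            self.used += 1
            return self.used  # slot number, used here as a simple token

    def release_slot(self) -> None:
        """Give a slot back; never drops below zero."""
        with self._lock:
            if self.used > 0:
                self.used -= 1


def run_workers(pool: Pool, n_workers: int) -> int:
    """Let n_workers threads race for slots; return how many were granted."""
    granted = []

    def worker() -> None:
        if pool.obtain_slot() is not None:
            granted.append(1)

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(granted)
```

With a pool of 3 slots and 20 competing workers, only 3 callers should receive a slot. In a database-backed version, the same check-and-increment would presumably run inside a transaction that first selects the pool row with `with_for_update()`.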



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
