[ https://issues.apache.org/jira/browse/AIRFLOW-192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16309206#comment-16309206 ]
Pratap Naik commented on AIRFLOW-192:
-------------------------------------

I think setting the priority to -1 for all tasks does the trick...

> Implement priority_weight aggregation using ancestors (rather than successors)
> ------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-192
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-192
>             Project: Apache Airflow
>          Issue Type: Improvement
>          Components: operators
>    Affects Versions: Airflow 1.7.1.2
>            Reporter: Sergei Iakhnin
>
> Currently, tasks are being scheduled based on the priority_weight. The
> effective priority of a task is its own priority plus the priorities of all
> tasks that follow it in a dag. This results in undesirable scheduling
> behaviour in my use case.
> My use case involves running scientific workflows where a number of
> operations are being carried out on a set of samples. Each sample is
> handled by a separate dag run that is manually triggered. It is common for
> several thousand dag instances to be in flight at a given time. The dag
> reserves a sample, operates on it, and then releases it. I would like for
> each sample to be reserved for as short a time as possible, so that other
> programs can have an opportunity to operate on it and dag runs can complete
> as fast as possible. However, because of the current priority logic, if I
> were to schedule several thousand dags at a given time, they would first all
> execute their first state, then all execute their second state, etc. Thus, no
> dag can complete fully until all dags complete their second-to-last state. This
> results in unnecessarily long dag run times and simultaneous completion of
> all dags.
> Ideally, Airflow would support the reverse of the current logic used for
> priorities, i.e. a task's priority is the sum of priorities of all its
> ancestors.
> This way, the further along a dag is in its processing the more
> likely its tasks will get scheduled (thus leading to a shorter completion
> time, and release of its resources).
> Also, a nominal priority mode would be useful, where a task's priority is
> exactly the number given to it by the author, in order to allow more
> scheduling flexibility.
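
To make the aggregation modes discussed in this issue concrete, here is a minimal Python sketch. It is not Airflow's scheduler code: the function names (reachable, effective_priorities), the mode labels ("downstream", "upstream", "absolute") and the three-task reserve/process/release chain are assumptions made for illustration only. It also assumes that a task with a higher effective priority is scheduled first.

```python
from collections import defaultdict


def reachable(start, edges):
    """Return every task reachable from `start` by following `edges` (start excluded)."""
    seen = set()
    stack = list(edges.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, ()))
    return seen


def effective_priorities(weights, downstream, mode="downstream"):
    """Aggregate per-task weights over a dag given as {task: [direct successors]}.

    mode (hypothetical labels):
      "downstream" - own weight plus all successors (the behaviour the issue describes)
      "upstream"   - own weight plus all ancestors (the behaviour the issue requests)
      "absolute"   - own weight only (the "nominal" mode the issue requests)
    """
    # Invert the successor map so ancestors can be walked the same way.
    upstream = defaultdict(list)
    for parent, children in downstream.items():
        for child in children:
            upstream[child].append(parent)

    result = {}
    for task, weight in weights.items():
        if mode == "downstream":
            related = reachable(task, downstream)
        elif mode == "upstream":
            related = reachable(task, upstream)
        elif mode == "absolute":
            related = set()
        else:
            raise ValueError(f"unknown mode: {mode!r}")
        result[task] = weight + sum(weights[t] for t in related)
    return result


# Illustrative dag: reserve a sample, operate on it, release it.
chain = {"reserve": ["process"], "process": ["release"], "release": []}
ones = {task: 1 for task in chain}
minus_ones = {task: -1 for task in chain}

current = effective_priorities(ones, chain, "downstream")
proposed = effective_priorities(ones, chain, "upstream")
nominal = effective_priorities(ones, chain, "absolute")
workaround = effective_priorities(minus_ones, chain, "downstream")

for label, values in [("current", current), ("proposed", proposed),
                      ("nominal", nominal), ("workaround", workaround)]:
    print(label, values)
```

With a weight of 1 per task, the successor-based sum ranks reserve (3) above process (2) above release (1), which is why many dag runs all start before any of them finishes. The ancestor-based sum reverses that order. Under the same successor-based sum, a weight of -1 per task gives reserve -3, process -2 and release -1, so the last task ranks highest, which is consistent with the workaround suggested in the comment above.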