[
https://issues.apache.org/jira/browse/AIRFLOW-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16709955#comment-16709955
]
Rolf Schroeder commented on AIRFLOW-2683:
-----------------------------------------
I can confirm this behaviour on 1.10.0.
I have also found that priority_weight only leads to "queued" tasks (which are
then ordered by priority) when those tasks are assigned to a Pool. The
documentation does mention priority_weight in the context of pools, but it was
not clear (I feel) that priority_weights are otherwise not honored (when using
Celery).
> priority_weights honored by LocalExecutor but not by CeleryExecutor
> -------------------------------------------------------------------
>
> Key: AIRFLOW-2683
> URL: https://issues.apache.org/jira/browse/AIRFLOW-2683
> Project: Apache Airflow
> Issue Type: Bug
> Components: celery, scheduler
> Affects Versions: 1.7.1.3
> Environment: Linux
> Reporter: Rolf Schroeder
> Priority: Major
> Attachments: airflow_priorities_celery.png,
> airflow_priorities_dag.png, airflow_priorities_localexecutor.png,
> airflow_priorities_purge_in_the_middle.png, test_priority.py
>
>
> Hi,
> I am running Airflow 1.7.1.3 with Celery 4.0.2 (and rabbitmq). I have come
> across the following issue: It seems to me that task priorities are not
> honored as expected when scheduling tasks via Celery. However, when using the
> LocalExecutor, the priorities seem to work fine. This issue arises when low
> prio tasks get scheduled/queued before high prio tasks. Celery will finish
> all previously scheduled/queued low prio tasks before tackling the high prio
> ones. In contrast, the LocalExecutor does not further process remaining low
> prio tasks and starts the high prio tasks. I feel like once Airflow has sent
> the tasks to Celery, it "loses" control of the execution order. Details
> below. I searched in Jira and possibly the following issues are linked
> (although I am not convinced):
> AIRFLOW-1510: https://issues.apache.org/jira/browse/AIRFLOW-1510
> AIRFLOW-584: https://issues.apache.org/jira/browse/AIRFLOW-584
> I have the following test setup: Two independent initial tasks, each with 3
> downstream tasks. One of the initial tasks takes longer but has high prio
> downstream tasks. The other initial task has a short duration and low prio
> downstream tasks. Both initial tasks start at the same time. My expectation
> is that some of the low prio tasks get executed first (since their initial
> upstream task is faster) but the moment the slower initial task is done, its
> high prio downstream tasks should get executed before the low prio downstream
> tasks.
> Here is a picture of the setup (forget about init0, this is just to make the
> DAG look correct):
> !airflow_priorities_dag.png!
> I have tried this with a LocalExecutor (AIRFLOW__CORE__PARALLELISM=2) and
> with two Celery workers. With the LocalExecutor, the high prio tasks are
> indeed executed once their init task is done (as expected, the 'prio100'
> tasks finish before the remaining 'prio10' tasks):
> !airflow_priorities_localexecutor.png!
> However, when using Celery, it seems that the priorities are not honored
> anymore. One can clearly see that the low prio tasks (prio10) are all
> executed before the high prio ones (prio100):
> !airflow_priorities_celery.png!
> Here is the corresponding code
> [^test_priority.py]
> I do not know whether this is expected behavior or a bug. Is there any way
> to "fix" this?
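
To make the difference described in the quoted report concrete, here is a toy
simulation in plain Python. It is only a sketch under assumptions and does not
reflect how Airflow or Celery are actually implemented: it uses a single worker
instead of two, every task takes exactly one step, and the task names, the
simulate() function and its priority_aware flag are all hypothetical. It
compares a queue drained strictly in arrival order (roughly what the report
observes with Celery) with a queue that always picks the highest
priority_weight among the waiting tasks (roughly what the report observes with
the LocalExecutor).

```python
import heapq
from collections import deque

# Hypothetical tasks: (arrival_step, name, priority_weight).
# The low prio tasks become ready first because their init task is faster.
TASKS = [
    (0, "prio10_a", 10),
    (0, "prio10_b", 10),
    (0, "prio10_c", 10),
    (1, "prio100_a", 100),
    (1, "prio100_b", 100),
    (1, "prio100_c", 100),
]


def simulate(tasks, priority_aware):
    """Return the order in which one worker runs the tasks.

    priority_aware=False: tasks run strictly in arrival order (FIFO).
    priority_aware=True:  the waiting task with the highest weight runs next.
    """
    pending = deque(sorted(tasks, key=lambda t: t[0]))  # stable sort by arrival
    waiting = []  # heap of (sort_key, arrival_sequence, name)
    order = []
    step = 0
    seq = 0
    while pending or waiting:
        # Move every task that has arrived by now into the waiting queue.
        while pending and pending[0][0] <= step:
            _, name, weight = pending.popleft()
            # heapq is a min-heap, so negate the weight to pop high weights first.
            key = -weight if priority_aware else 0
            heapq.heappush(waiting, (key, seq, name))
            seq += 1
        # The single worker runs exactly one task per step.
        if waiting:
            order.append(heapq.heappop(waiting)[2])
        step += 1
    return order


if __name__ == "__main__":
    print("arrival order: ", simulate(TASKS, priority_aware=False))
    print("priority order:", simulate(TASKS, priority_aware=True))
```

In this toy model the arrival-order run finishes all three prio10 tasks before
any prio100 task, while the priority-aware run executes only the first prio10
task and then switches to the prio100 tasks as soon as they arrive.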