[ https://issues.apache.org/jira/browse/AIRFLOW-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15896260#comment-15896260 ]
Bolke de Bruin commented on AIRFLOW-931:
----------------------------------------
[~v_krishna] please see https://github.com/apache/incubator-airflow/pull/2127
and test it.
> LocalExecutor fails to run queued task with race condition
> ----------------------------------------------------------
>
> Key: AIRFLOW-931
> URL: https://issues.apache.org/jira/browse/AIRFLOW-931
> Project: Apache Airflow
> Issue Type: Sub-task
> Affects Versions: Airflow 1.8, 1.8.0rc4
> Reporter: Vijay Krishna Ramesh
> Assignee: Bolke de Bruin
>
> https://gist.github.com/vijaykramesh/707262c83429ab2a3f5ee701879813e3
> provides a small example that consistently hits this problem with
> LocalExecutor.
> Basically, when the DAG run kicks off (with concurrency > 1) on a
> LocalExecutor with parallelism > 2, the scheduler marks more tasks as
> queued than concurrency allows
> (https://github.com/apache/incubator-airflow/blob/master/airflow/jobs.py#L1095).
> There is a second check before actually running the task
> (https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L1291)
> that leaves the task in the QUEUED state, but the scheduler then never picks
> it back up. This causes the DAG to get stuck (as the queued tasks never run)
> until the scheduler is restarted, at which point the enqueued tasks are
> considered orphaned, their status is set to NONE, and they are picked up
> by the scheduler again and run.
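
The sequence described in the quoted report can be sketched as a small standalone simulation. This is not Airflow code: the State enum and the functions schedule, execute, finish_running, reset_orphans and simulate are hypothetical names made up for illustration, and the model assumes the scheduler only ever considers tasks whose state is NONE, as the report implies.

```python
from enum import Enum


class State(Enum):
    NONE = "none"
    QUEUED = "queued"
    RUNNING = "running"
    SUCCESS = "success"


def schedule(tasks, parallelism):
    """Hypothetical scheduler pass: queue up to `parallelism` tasks in state
    NONE. It deliberately ignores DAG concurrency, modelling the over-queuing."""
    queued = []
    for name, state in tasks.items():
        if state is State.NONE and len(queued) < parallelism:
            tasks[name] = State.QUEUED
            queued.append(name)
    return queued


def execute(tasks, queued, concurrency):
    """Hypothetical second check before running: start a task only while fewer
    than `concurrency` tasks are running; otherwise leave it QUEUED."""
    for name in queued:
        running = sum(1 for s in tasks.values() if s is State.RUNNING)
        if running < concurrency:
            tasks[name] = State.RUNNING


def finish_running(tasks):
    """Mark every running task as finished."""
    for name, state in tasks.items():
        if state is State.RUNNING:
            tasks[name] = State.SUCCESS


def reset_orphans(tasks):
    """Model a scheduler restart: QUEUED tasks are treated as orphaned and
    reset to NONE so they become schedulable again."""
    for name, state in tasks.items():
        if state is State.QUEUED:
            tasks[name] = State.NONE


def simulate(num_tasks=4, concurrency=2, parallelism=4):
    tasks = {f"t{i}": State.NONE for i in range(num_tasks)}
    execute(tasks, schedule(tasks, parallelism), concurrency)
    finish_running(tasks)
    # A later scheduler pass finds no task in state NONE, so nothing is queued.
    second_pass = schedule(tasks, parallelism)
    stuck = sorted(n for n, s in tasks.items() if s is State.QUEUED)
    # Only the restart makes the stuck tasks schedulable again.
    reset_orphans(tasks)
    after_restart = sorted(schedule(tasks, parallelism))
    return stuck, second_pass, after_restart


if __name__ == "__main__":
    print(simulate())
```

With four tasks, concurrency 2 and parallelism 4, two tasks run and two stay in QUEUED; the second scheduler pass queues nothing, and only the simulated restart makes the remaining two schedulable again.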
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)