[ https://issues.apache.org/jira/browse/AIRFLOW-931?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15891162#comment-15891162 ]
Bolke de Bruin edited comment on AIRFLOW-931 at 3/1/17 10:10 PM:
-----------------------------------------------------------------
I don't think your analysis is correct. Line 1291 in models.py is only executed
when "if not runnable and not mark_success:" holds, which means the dependencies
required for running the task are not met.
In addition, your PR can result in tasks being executed twice, because QUEUED is
the state tasks are in when they enter the executor, i.e. the state right before
they are RUNNING.
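
To make the double-execution concern concrete, here is a toy sketch. It is not Airflow code: scheduler_loop, resend_queued and the plain dict of states are all hypothetical, and it assumes (the PR itself is not shown here) that the change makes QUEUED tasks eligible for scheduling again.

```python
QUEUED = "queued"


def scheduler_loop(task_states, executor_queue, resend_queued):
    """One toy scheduler pass; hands eligible task ids to the executor."""
    for task_id, state in task_states.items():
        # Normally only tasks without a state are eligible. With
        # resend_queued=True, tasks already waiting in the executor
        # (state QUEUED, not yet RUNNING) are treated as eligible too.
        eligible = state is None or (resend_queued and state == QUEUED)
        if eligible:
            task_states[task_id] = QUEUED
            executor_queue.append(task_id)


def submissions_after_two_loops(resend_queued):
    """Count how often task t1 reaches the executor in two scheduler passes."""
    task_states = {"t1": None}
    executor_queue = []
    # Both passes happen before the executor has moved t1 to RUNNING.
    scheduler_loop(task_states, executor_queue, resend_queued)
    scheduler_loop(task_states, executor_queue, resend_queued)
    return executor_queue.count("t1")
```

In this model the task is handed to the executor once when QUEUED tasks are left alone, and twice when they are picked up again before they start running.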
was (Author: bolke):
I don't think your analysis is correct. Line 1291 in models.py is only executed
when "if not runnable and not mark_success:". This means required dependencies
are not met.
> LocalExecutor fails to run queued task with race condition
> ----------------------------------------------------------
>
> Key: AIRFLOW-931
> URL: https://issues.apache.org/jira/browse/AIRFLOW-931
> Project: Apache Airflow
> Issue Type: Bug
> Affects Versions: Airflow 1.8, 1.8.0rc4
> Reporter: Vijay Krishna Ramesh
>
> https://gist.github.com/vijaykramesh/707262c83429ab2a3f5ee701879813e3
> provides a small example that consistently hits this problem with
> LocalExecutor.
> Basically, when the DAG run kicks off (with concurrency > 1) on a
> LocalExecutor with parallelism > 2, the scheduler marks more tasks as queued
> than the concurrency limit allows
> (https://github.com/apache/incubator-airflow/blob/master/airflow/jobs.py#L1095).
> There is a second check before actually running the task
> (https://github.com/apache/incubator-airflow/blob/master/airflow/models.py#L1291)
> that leaves the task in the QUEUED state, but then the scheduler never picks
> it back up. This causes the DAG to get stuck (as the queued tasks never run)
> until the scheduler is restarted, at which point the enqueued tasks are
> considered orphaned, their state is set to NONE, and they are picked up
> by the scheduler again and run.
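
The sequence described in the report can be modelled with a small simulation. This is only a sketch under stated assumptions, not the real scheduler or LocalExecutor: CONCURRENCY, scheduler_pass, executor_run and reset_orphans are hypothetical names, and states are plain strings (None standing in for the NONE state).

```python
CONCURRENCY = 2  # hypothetical DAG-level concurrency limit


def scheduler_pass(states):
    """Queue every task that has no state; return the ids sent to the executor."""
    sent = [task for task, state in states.items() if state is None]
    for task in sent:
        states[task] = "queued"
    return sent


def executor_run(states, sent):
    """Second check at run time: start a task only while under the limit."""
    for task in sent:
        running = sum(1 for state in states.values() if state == "running")
        if running < CONCURRENCY:
            states[task] = "running"
        # Otherwise the task is simply left in "queued".


def reset_orphans(states):
    """Model of a scheduler restart: queued tasks are reset to no state."""
    for task, state in states.items():
        if state == "queued":
            states[task] = None


states = {"a": None, "b": None, "c": None}

# The scheduler queues all three tasks, one more than the limit allows.
executor_run(states, scheduler_pass(states))
after_first_pass = dict(states)

# a and b finish, but c is still "queued", so no later pass picks it up.
states["a"] = states["b"] = "success"
picked_up_while_stuck = scheduler_pass(states)

# Only after the restart is c reset, queued again, and finally run.
reset_orphans(states)
executor_run(states, scheduler_pass(states))
after_restart = dict(states)
```

In this model, task c stays queued after a and b complete, and it only runs once the simulated restart resets it.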