[ 
https://issues.apache.org/jira/browse/AIRFLOW-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16419905#comment-16419905
 ] 

Winston Huang commented on AIRFLOW-2270:
----------------------------------------

PR with a possible fix: https://github.com/apache/incubator-airflow/pull/3176

> Subdag backfill spins on removed tasks
> --------------------------------------
>
>                 Key: AIRFLOW-2270
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-2270
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: Winston Huang
>            Priority: Major
>
> My understanding is that subdag operators execute via a backfill job which 
> runs in a loop, maintaining the state of the associated tasks and breaking 
> only once all pending tasks have been exhausted: 
> [https://github.com/apache/incubator-airflow/blob/64206615a790c90893d5836da8d2f7159bda23ac/airflow/jobs.py#L2159]
>  
> The issue is that this task instance status is initialized by this method 
> [https://github.com/apache/incubator-airflow/blob/64206615a790c90893d5836da8d2f7159bda23ac/airflow/jobs.py#L2075],
>  which may include tasks with {{state = State.REMOVED}}, i.e. tasks that were 
> previously instantiated in the database but removed from the dag definition. 
> Hence, such a removed task will be missing from this list 
> [https://github.com/apache/incubator-airflow/blob/64206615a790c90893d5836da8d2f7159bda23ac/airflow/jobs.py#L2168]
>  but will exist in {{ti_status.to_run}}. This causes the backfill job to loop 
> indefinitely, since it considers those removed tasks to be pending but 
> doesn't attempt to run them.
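
The behaviour described above can be reproduced with a toy model that does not import Airflow. This is only a sketch of the reported logic: the names simulate_backfill, skip_removed and max_passes are hypothetical, and filtering out removed tasks is one possible guard, not necessarily what the linked PR does.

```python
# Toy model of the backfill loop described in the issue; no Airflow imports.
# All names here are hypothetical and exist only for illustration.

REMOVED = "removed"
SCHEDULED = "scheduled"


def simulate_backfill(dag_task_ids, db_task_states, skip_removed, max_passes=5):
    """Return the number of passes needed to drain to_run, or None if the
    loop is still spinning after max_passes."""
    # The pending set is built from database rows, which may include tasks
    # that were since removed from the DAG definition.
    to_run = {
        task_id: state
        for task_id, state in db_task_states.items()
        if not (skip_removed and state == REMOVED)
    }
    passes = 0
    while to_run:
        if passes >= max_passes:
            return None  # stand-in for "loops indefinitely"
        passes += 1
        # Only tasks still present in the DAG definition are visited,
        # so a removed task is never taken out of to_run.
        for task_id in dag_task_ids:
            to_run.pop(task_id, None)  # pretend the task ran successfully
    return passes


db_rows = {"a": SCHEDULED, "b": SCHEDULED, "old_task": REMOVED}
spinning = simulate_backfill(["a", "b"], db_rows, skip_removed=False)
guarded = simulate_backfill(["a", "b"], db_rows, skip_removed=True)
```

With the removed row left in the pending set the loop never drains and the sketch gives up after max_passes; with the row filtered out it finishes in a single pass.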



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
