[ https://issues.apache.org/jira/browse/AIRFLOW-441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15566210#comment-15566210 ]
Laura Lorenz commented on AIRFLOW-441:
--------------------------------------

I haven't looked into the source at all but this has definitely bit me too. I would prefer the DagRun failure state to occur after all tasks have been attempted and are either 'success', 'failed', or 'skipped' status.

> DagRuns are marked as failed as soon as one task fails
> ------------------------------------------------------
>
>                 Key: AIRFLOW-441
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-441
>             Project: Apache Airflow
>          Issue Type: Bug
>            Reporter: Jeff Balogh
>
> https://github.com/apache/incubator-airflow/pull/1514 added a
> [{{verify_integrity}} function|https://github.com/apache/incubator-airflow/blob/fcf645b/airflow/models.py#L3850-L3877]
> that greedily creates {{TaskInstance}} objects for all tasks in a dag.
>
> This does not interact well with the assumptions in the new
> [{{update_state}} function|https://github.com/apache/incubator-airflow/blob/fcf645b/airflow/models.py#L3816-L3825].
>
> The guard for {{if len(tis) == len(dag.active_tasks)}} is no longer
> effective; in the old world of lazily-created tasks this code would only
> run once all the tasks in the dag had run. Now it runs all the time, and
> as soon as one task in a dag run fails the whole DagRun fails. This is bad
> since the scheduler stops processing the DagRun after that.
>
> In retrospect, the old code was also buggy: if your dag ends with a bunch
> of Queued tasks the DagRun could be marked as failed prematurely.
>
> I suspect the fix is to update the guard to look at tasks where the state
> is success or failed. Otherwise we're evaluating and failing the dag based
> on up_for_retry/queued/scheduled tasks.
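
The guard proposed in this thread could be sketched roughly as follows. This is a standalone illustration, not Airflow's actual {{update_state}} code: the function name `dag_run_state`, the `FINISHED_STATES` set, and the plain-string state values are assumptions made for the example. It treats 'success', 'failed', and 'skipped' as finished (per the comment above) and keeps the run in 'running' while any task is still up_for_retry, queued, scheduled, or otherwise unfinished.

```python
# Hypothetical sketch; names and state strings are assumptions for illustration,
# not the real Airflow API.
FINISHED_STATES = {"success", "failed", "skipped"}


def dag_run_state(task_states):
    """Decide a DagRun state from the states of all its task instances.

    The run is only evaluated once every task has reached a finished state,
    so a single early failure does not end the run while other tasks are
    still queued, scheduled, running, or waiting for a retry.
    """
    # Guard: nothing to evaluate yet, or at least one task is unfinished.
    if not task_states or any(s not in FINISHED_STATES for s in task_states):
        return "running"
    # Every task is finished; one failure is enough to fail the run.
    if any(s == "failed" for s in task_states):
        return "failed"
    return "success"


if __name__ == "__main__":
    # One task failed but another is still queued, so the run stays open.
    print(dag_run_state(["success", "failed", "queued"]))
    # All tasks finished, one of them failed.
    print(dag_run_state(["success", "failed", "skipped"]))
```

Whether a failed task should also fail the run early once its downstream tasks can no longer execute is a separate design question that this sketch does not address.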