[
https://issues.apache.org/jira/browse/AIRFLOW-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Steve Jacobs updated AIRFLOW-3344:
----------------------------------
Description:
When using the airflow clear command from the CLI, you can pass the --only_failed
flag to clear only failed tasks. This clears ONLY tasks in the failed state, not
tasks in the upstream_failed state, so any such clear still fails the dag_run if
any upstream tasks have failed.
Since the one_failed trigger rule also counts upstream_failed tasks, it would be
consistent for --only_failed to clear upstream_failed tasks as well. The relevant
code change is here:
{code:python}
if only_failed:
    tis = tis.filter(TI.state == State.FAILED)
{code}
to
{code:python}
if only_failed:
    tis = tis.filter(TI.state.in_([State.FAILED, State.UPSTREAM_FAILED]))
{code}
in models.py
Additionally, when clearing DAGs this way, the dag_run is set to the running
state, but its start_date is not updated to the current time, as it is when
clearing tasks through the web UI. As a result, dag_runs can still fail on
their timeout even when every task in the DAG succeeds. This should be
changed as well.
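To illustrate the intended behavior, here is a minimal, Airflow-independent sketch. The Task and DagRun classes below are simplified stand-ins for Airflow's models, not its actual API; only the two marked lines correspond to the proposed fixes:

{code:python}
from dataclasses import dataclass, field
from datetime import datetime, timezone

FAILED = "failed"
UPSTREAM_FAILED = "upstream_failed"
RUNNING = "running"


@dataclass
class Task:
    task_id: str
    state: str


@dataclass
class DagRun:
    state: str
    start_date: datetime
    tasks: list = field(default_factory=list)


def clear(dag_run, only_failed=False):
    """Clear task instances of a run; with only_failed, restrict to failures."""
    tis = dag_run.tasks
    if only_failed:
        # Fix 1: treat upstream_failed like failed, mirroring how the
        # one_failed trigger rule defines a failure.
        tis = [ti for ti in tis if ti.state in (FAILED, UPSTREAM_FAILED)]
    for ti in tis:
        ti.state = None  # cleared, will be rescheduled
    # Fix 2: restart the run AND reset its start_date, so the dag_run
    # timeout is measured from the re-run, as the web UI already does.
    dag_run.state = RUNNING
    dag_run.start_date = datetime.now(timezone.utc)
    return tis
{code}

With a run containing one failed, one upstream_failed, and one successful task, clear(run, only_failed=True) clears the first two, leaves the successful task untouched, and resets the run's state and start_date.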
was:
When using the airflow clear command from the cli, you can pass an argument
--only_failed to clear only failed tasks. This will clear ONLY tasks with the
state failed, and not tasks with the state upstream_failed, causing any clear
to still fail the dag_run if any upstream tasks are failed.
Since one_failed as a trigger rule also checks for upstream_failed tasks, it
seems consistent that this should also clear upstream_failed tasks. The
relevant code change necessary is here:
{code:python}
if only_failed:
    tis = tis.filter(TI.state == State.FAILED)
{code}
to
{code:python}
if only_failed:
    tis = tis.filter(TI.state.in_([State.FAILED, State.UPSTREAM_FAILED]))
{code}
in models.py
> Airflow DAG object clear function does not clear tasks in the upstream_failed
> state when only_failed=True
> ---------------------------------------------------------------------------------------------------------
>
> Key: AIRFLOW-3344
> URL: https://issues.apache.org/jira/browse/AIRFLOW-3344
> Project: Apache Airflow
> Issue Type: Bug
> Components: DAG
> Affects Versions: 1.8.2, 1.9.0, 1.10.0
> Reporter: Steve Jacobs
> Priority: Minor
> Labels: easyfix, newbie
> Original Estimate: 1h
> Remaining Estimate: 1h
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)