[ https://issues.apache.org/jira/browse/AIRFLOW-3344?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Jacobs updated AIRFLOW-3344:
----------------------------------
    Description: 
When using the airflow clear command from the CLI, you can pass the
--only_failed argument to clear only failed tasks. This clears ONLY tasks in
the failed state, not tasks in the upstream_failed state, so any clear still
fails the dag_run if any upstream tasks are failed.

Since the one_failed trigger rule also checks for upstream_failed tasks, it
would be consistent for --only_failed to clear upstream_failed tasks as well.
The relevant code change is from:
{code:python}
if only_failed:
    tis = tis.filter(TI.state == State.FAILED)
{code}
to
{code:python}
if only_failed:
    tis = tis.filter(TI.state.in_([State.FAILED, State.UPSTREAM_FAILED]))
{code}
in models.py
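A minimal, self-contained sketch (plain Python, not Airflow's actual models.py; the state constants and function names below are stand-ins) showing the effect the proposed filter change would have on a set of task-instance states:

```python
# Stand-in state constants; Airflow's State.FAILED etc. are plain strings too.
SUCCESS = "success"
FAILED = "failed"
UPSTREAM_FAILED = "upstream_failed"

def clearable(state, include_upstream_failed):
    """Would a task instance in this state be cleared under --only_failed?"""
    if include_upstream_failed:
        # Proposed behavior: treat upstream_failed like failed, matching
        # how the one_failed trigger rule treats both states.
        return state in (FAILED, UPSTREAM_FAILED)
    # Current behavior: only tasks in the failed state are cleared.
    return state == FAILED

states = [SUCCESS, FAILED, UPSTREAM_FAILED, UPSTREAM_FAILED]
cleared_now = [s for s in states if clearable(s, False)]
cleared_fix = [s for s in states if clearable(s, True)]
```

With the current filter only the single failed task is cleared, so both upstream_failed tasks remain and keep failing the dag_run; the proposed filter clears all three.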

Additionally, when clearing DAGs from the CLI, the dag_run is set to the
running state but its start_date is not updated to the current time, as it is
when clearing tasks through the Web UI. This causes dag_runs to fail on their
timeouts even when every task in the DAG succeeds. This should be fixed as
well.



> Airflow DAG object clear function does not clear tasks in the upstream_failed 
> state when only_failed=True
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-3344
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-3344
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG
>    Affects Versions: 1.8.2, 1.9.0, 1.10.0
>            Reporter: Steve Jacobs
>            Priority: Minor
>              Labels: easyfix, newbie
>   Original Estimate: 1h
>  Remaining Estimate: 1h



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
