[ 
https://issues.apache.org/jira/browse/AIRFLOW-5191?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16942326#comment-16942326
 ] 

Michael A Perez commented on AIRFLOW-5191:
------------------------------------------

Hello Oliver,

I recently had to address this issue with a DAG that attempts a CloudSQLImport 
and falls back to a bulk upsert if the import task fails.

Basically, when a task instance fails 
([https://github.com/apache/airflow/blob/master/airflow/models/taskinstance.py#L1047])
 it announces the failure and sets itself to failed. I've found that updating 
the task after the fact, e.g. in a BranchPythonOperator, doesn't do the trick; 
however, overriding {{on_failure_callback}} does!

The function I pass as my task's {{on_failure_callback}} parameter looks like:
{code:python}
from airflow.utils.state import State

def unfail(context):
    """Needed to prevent the task failure from propagating."""
    context['ti'].set_state(State.SKIPPED)
{code}
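
To try the callback outside a scheduler, here is a minimal sketch that runs without Airflow installed. {{FakeTaskInstance}} and the task id are hypothetical stand-ins of mine, not Airflow classes; the only thing they share with the real task instance is a {{set_state}} method. The two string constants are assumed to match the values of {{State.SKIPPED}} and {{State.FAILED}}.

```python
# Hypothetical stand-ins so the sketch runs without Airflow installed.
SKIPPED = "skipped"  # assumed to equal airflow.utils.state.State.SKIPPED
FAILED = "failed"    # assumed to equal airflow.utils.state.State.FAILED


class FakeTaskInstance:
    """Mimics only the part of a task instance that the callback touches."""

    def __init__(self, task_id):
        self.task_id = task_id
        self.state = FAILED  # the callback only runs after a failure

    def set_state(self, state):
        self.state = state


def unfail(context):
    """Mark the failed task instance as skipped so the failure does not propagate."""
    context["ti"].set_state(SKIPPED)


# Airflow would build the context itself; here it is built by hand.
ti = FakeTaskInstance("cloudsql_import")
unfail({"ti": ti})
print(ti.state)
```

In the real DAG nothing calls {{unfail}} directly: it is passed as {{on_failure_callback=unfail}} when the failing task's operator is constructed, and Airflow calls it with the context.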


TL;DR: you gotta change the failing task's state using its 
{{on_failure_callback}} parameter.

> SubDag is marked failed 
> ------------------------
>
>                 Key: AIRFLOW-5191
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5191
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: DAG, DagRun
>    Affects Versions: 1.10.4
>         Environment: CentOS 7, Maria-DB, python 3.6.7, Airflow 1.10.4
>            Reporter: Oliver Ricken
>            Priority: Blocker
>
> Dear all,
> after having upgraded from Airflow version 1.10.2 to 1.10.4, we experience 
> strange and very problematic behaviour of SubDags (which are crucial for our 
> environment and used frequently).
> Tasks inside the SubDag failing and awaiting retry ("up-for-retry") mark the 
> SubDag "failed" (while in 1.10.2, the SubDag was still in "running"-state). 
> This is particularly problematic for downstream tasks depending on the state 
> of the SubDag. Since we have downstream tasks triggered on "all_done", the 
> downstream task is triggered by the "failed" SubDag although a 
> SubDag-internal task is awaiting retry and might (in our case: most likely) 
> yield successfully processed data. This data is thus not available to the 
> prematurely triggered task downstream of the SubDag.
> This is a severe problem for us and worth rolling back to 1.10.2 if there is 
> no quick solution or work-around to this issue!
> We urgently need help on this matter.
> Thanks a lot in advance, any suggestions and input are highly appreciated!
> Cheers
> Oliver



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
