[ 
https://issues.apache.org/jira/browse/AIRFLOW-5391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Qian Yu reassigned AIRFLOW-5391:
--------------------------------

    Assignee: Qian Yu

> Clearing a task skipped by BranchPythonOperator will cause the task to execute
> ------------------------------------------------------------------------------
>
>                 Key: AIRFLOW-5391
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-5391
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: operators
>    Affects Versions: 1.10.4
>            Reporter: Qian Yu
>            Assignee: Qian Yu
>            Priority: Major
>
> I tried this on 1.10.3 and 1.10.4, and both have this issue. 
> In this example from the docs, branch_a executed and branch_false was 
> skipped because of the branching condition. However, if someone clears 
> branch_false, branch_false will execute. 
> !https://airflow.apache.org/_images/branch_good.png!
> This behaviour is understandable given how BranchPythonOperator is 
> implemented. BranchPythonOperator does not store its decision anywhere. It 
> skips its own downstream tasks in the branch at runtime. So there's currently 
> no way for branch_false to know it should be skipped without rerunning the 
> branching task.
> This is obviously counter-intuitive from the user's perspective. In this 
> example, users would not expect branch_false to execute when they clear it 
> because the branching task should have skipped it.
> There are a few ways to improve this:
> Option 1): Make downstream tasks skipped by BranchPythonOperator not 
> clearable without also clearing the upstream BranchPythonOperator. In this 
> example, if someone clears branch_false without clearing branching, the Clear 
> action should fail with an error telling the user that the branching task 
> must be cleared as well.
> Option 2): Make BranchPythonOperator store the result of its skip condition 
> somewhere. Make downstream tasks check for this stored decision and skip 
> themselves if they should have been skipped by the condition. This probably 
> means the decision of BranchPythonOperator needs to be stored in the db.
>  
> [kevcampb|https://blog.diffractive.io/author/kevcampb/] attempted a 
> workaround on his blog. He acknowledged that the workaround is not 
> perfect and that a better permanent fix is needed:
> [https://blog.diffractive.io/2018/08/07/replacement-shortcircuitoperator-for-airflow/]
>  
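
The behaviour described above, and the effect Option 2 would have, can be
illustrated with a small toy model. This is only a sketch in plain Python and
does not use Airflow: ToyDagRun, run_branch and clear_and_rerun are
hypothetical names invented for illustration, and the "decisions" dict merely
stands in for wherever a real fix might persist the branching result (for
example the db).

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Toy model only: none of these names are Airflow APIs.
SUCCESS = "success"
SKIPPED = "skipped"


@dataclass
class ToyDagRun:
    # task_id -> state (None means "cleared, waiting to run")
    states: Dict[str, Optional[str]] = field(default_factory=dict)
    # branching task_id -> task_id it chose to follow (the Option 2 idea)
    decisions: Dict[str, str] = field(default_factory=dict)


def run_branch(run: ToyDagRun, branch_id: str, candidates: List[str],
               chosen: str, store_decision: bool) -> None:
    """Run the branching task: the chosen branch runs, the others are skipped."""
    run.states[branch_id] = SUCCESS
    for task_id in candidates:
        run.states[task_id] = SUCCESS if task_id == chosen else SKIPPED
    if store_decision:
        # Option 2: remember the decision instead of only skipping at runtime.
        run.decisions[branch_id] = chosen


def clear_and_rerun(run: ToyDagRun, task_id: str, upstream_branch: str) -> str:
    """Clear a single task (not the branching task) and let it run again."""
    run.states[task_id] = None  # Clear wipes the previous "skipped" state
    chosen = run.decisions.get(upstream_branch)
    if chosen is not None and chosen != task_id:
        # A stored decision exists and excludes this task: skip again.
        run.states[task_id] = SKIPPED
    else:
        # Nothing records that the task should be skipped, so it executes.
        run.states[task_id] = SUCCESS
    return run.states[task_id]


# Behaviour reported in this issue: no stored decision.
reported = ToyDagRun()
run_branch(reported, "branching", ["branch_a", "branch_false"],
           chosen="branch_a", store_decision=False)
print(clear_and_rerun(reported, "branch_false", "branching"))  # success

# Option 2: the decision is stored and consulted when the task reruns.
proposed = ToyDagRun()
run_branch(proposed, "branching", ["branch_a", "branch_false"],
           chosen="branch_a", store_decision=True)
print(clear_and_rerun(proposed, "branch_false", "branching"))  # skipped
```

In the first run the cleared branch_false executes because nothing tells it
otherwise; in the second it consults the stored decision and stays skipped.
How and where a real implementation would store that decision is left open
here.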



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
