[ https://issues.apache.org/jira/browse/AIRFLOW-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16168195#comment-16168195 ]
Tylar Hoag commented on AIRFLOW-1296:
-------------------------------------

After applying the update to my test environment, I didn't see the problem fixed as described. I am using the CeleryExecutor.

My issue was with the BranchPythonOperator, and after reviewing the code I didn't see any updates that would cause all downstream tasks to be skipped. Only the immediately downstream tasks are skipped. I added this small code change to recursively skip all downstream tasks, just to illustrate the behavior I am looking for:

https://github.com/magnuschill/incubator-airflow/commit/9ba11903ff3e34e18a072719f38918a274ada2d1

If I'm not understanding this issue correctly, could someone please elaborate? Otherwise, I'd be happy to polish the solution I'm proposing and submit it as a proper issue and PR.

> DAGs using operators involving cascading skipped tasks fail prematurely
> -----------------------------------------------------------------------
>
>                 Key: AIRFLOW-1296
>                 URL: https://issues.apache.org/jira/browse/AIRFLOW-1296
>             Project: Apache Airflow
>          Issue Type: Bug
>          Components: scheduler
>    Affects Versions: 1.8.1
>            Reporter: Daniel Huang
>            Assignee: Bolke de Bruin
>            Priority: Blocker
>             Fix For: 1.8.2
>
>
> So this is basically the same issue as AIRFLOW-872 and AIRFLOW-719. A
> workaround had fixed this
> (https://github.com/apache/incubator-airflow/pull/2125), but was later
> reverted (https://github.com/apache/incubator-airflow/pull/2195). I totally
> agree with the reason for reverting, but I still think this is an issue.
> The issue is related to any operator that involves cascading skipped tasks,
> like ShortCircuitOperator or LatestOnlyOperator. These operators mark only
> their *direct* downstream tasks as SKIPPED; cascading the SKIPPED state to
> tasks further downstream of those skipped tasks is left up to the scheduler
> (see the latest-only operator docs about this expected behavior:
> https://airflow.incubator.apache.org/concepts.html#latest-run-only).
> Instead, however, the scheduler marks the DAG run as FAILED prematurely,
> before the DAG has a chance to skip all downstream tasks.
> This example DAG should reproduce the issue:
> https://gist.github.com/dhuang/61d38fb001c3a917edf4817bb0c915f9.
> Expected result: DAG succeeds with tasks - latest_only (success) -> dummy1
> (skipped) -> dummy2 (skipped) -> dummy3 (skipped)
> Actual result: DAG fails with tasks - latest_only (success) -> dummy1
> (skipped) -> dummy2 (none) -> dummy3 (none)
> I believe the results I'm seeing are caused by this deadlock prevention
> logic:
> https://github.com/apache/incubator-airflow/blob/1.8.1/airflow/models.py#L4182.
> While the actual result shown above _could_ indicate a deadlock, in this
> case it shouldn't be treated as one. Since this {{update_state}} logic is
> reached first in each scheduler run, dummy2/dummy3 don't get a chance to
> have the SKIPPED state cascaded to them.
> Commenting out that block gives me the results I expect.
> [~bolke] I know you spent a while trying to reproduce my issue and weren't
> able to, but I'm still hitting this on a fresh environment, default configs,
> sqlite/mysql dbs, local/sequential/celery executors, and 1.8.1/master.



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
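
For readers following the thread, here is a minimal sketch of what "recursively skip all downstream tasks" could look like on a toy task graph. It is not the code from the linked commit and does not use any Airflow API: the `cascade_skip` function, the `downstream` mapping, and the state strings are hypothetical names chosen for illustration. It also ignores trigger rules and tasks that have other, non-skipped upstream parents, so it is only an approximation of the behavior being requested.

```python
from collections import deque


def cascade_skip(downstream, start_task, states):
    """Mark every task reachable downstream of start_task as 'skipped'.

    downstream: dict mapping a task id to the list of its direct
                downstream task ids (hypothetical toy structure).
    states:     dict mapping a task id to its current state; updated in place.
    """
    # Breadth-first walk starting from the direct downstream tasks.
    queue = deque(downstream.get(start_task, []))
    while queue:
        task = queue.popleft()
        if states.get(task) == "skipped":
            continue  # already visited through another path
        states[task] = "skipped"
        queue.extend(downstream.get(task, []))
    return states


# Toy version of the DAG from the issue description:
# latest_only -> dummy1 -> dummy2 -> dummy3
downstream = {
    "latest_only": ["dummy1"],
    "dummy1": ["dummy2"],
    "dummy2": ["dummy3"],
    "dummy3": [],
}
states = {"latest_only": "success"}
cascade_skip(downstream, "latest_only", states)
print(states)
```

With this toy graph, dummy1, dummy2 and dummy3 all end up marked as skipped while latest_only keeps its success state, which matches the expected result listed in the issue. A real fix would also have to respect trigger rules, for example by not skipping a join task that still has a non-skipped upstream branch.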