[
https://issues.apache.org/jira/browse/AIRFLOW-1296?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bolke de Bruin updated AIRFLOW-1296:
------------------------------------
Affects Version/s: 1.8.1
> DAGs using operators involving cascading skipped tasks fail prematurely
> -----------------------------------------------------------------------
>
> Key: AIRFLOW-1296
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1296
> Project: Apache Airflow
> Issue Type: Bug
> Components: scheduler
> Affects Versions: 1.8.1
> Reporter: Daniel Huang
> Priority: Blocker
> Fix For: 1.8.2
>
>
> So this is basically the same issue as AIRFLOW-872 and AIRFLOW-719. A
> workaround had fixed this
> (https://github.com/apache/incubator-airflow/pull/2125), but was later
> reverted (https://github.com/apache/incubator-airflow/pull/2195). I totally
> agree with the reason for reverting, but I still think this is an issue.
> The issue is related to any operators that involves cascading skipped tasks,
> like ShortCircuitOperator or LatestOnlyOperator. These operators mark only
> their *direct* downstream task as SKIPPED, but additional downstream tasks
> from that skipped task is left up to the scheduler to cascade the SKIPPED
> state (see latest only op docs about this expected behavior
> https://airflow.incubator.apache.org/concepts.html#latest-run-only). However,
> instead the scheduler marks the DAG run as FAILED prematurely before the DAG
> has a chance to skip all downstream tasks.
> This example DAG should reproduce the issue:
> https://gist.github.com/dhuang/61d38fb001c3a917edf4817bb0c915f9.
> Expected result: DAG succeeds with tasks - latest_only (success) -> dummy1
> (skipped) -> dummy2 (skipped) -> dummy3 (skipped)
> Actual result: DAG fails with tasks - latest_only (success) -> dummy1
> (skipped) -> dummy2 (none) -> dummy3 (none)
> I believe the results I'm seeing are because of this deadlock prevention
> logic,
> https://github.com/apache/incubator-airflow/blob/1.8.1/airflow/models.py#L4182.
> While that actual result shown above _could_ mean a deadlock, in this case
> it shouldn't be. Since this {{update_state}} logic is reached first in each
> scheduler run, dummy2/dummy3 don't get a chance to cascade the SKIPPED state.
> Commenting out that block gives me the results I expect.
> [~bolke] I know you spent awhile trying to reproduce my issue and weren't
> able to, but I'm still hitting this on a fresh environment, default configs,
> sqlite/mysql dbs, local/sequential/celery executors, and 1.8.1/master.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)