[
https://issues.apache.org/jira/browse/AIRFLOW-1884?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Grant Nicholas updated AIRFLOW-1884:
------------------------------------
Description:
Orphaned task instances are only reset for dagruns that are neither externally
triggered nor backfilled. This violates the crash safety property of the
scheduler, i.e., if the scheduler crashes in the middle of an externally
triggered or backfilled dagrun, its tasks can be stuck in the "Queued" state
forever and never be executed.
I found the change that introduced this regression:
https://issues.apache.org/jira/browse/AIRFLOW-1059
This change reverts the special casing logic so that externally triggered
dagruns have orphaned tasks reset on startup of the scheduler. Backfilled
dagruns are still not crash safe; if that needs to be fixed, it will be done
in another PR.
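To make the intended behavior concrete, here is a minimal, self-contained
sketch of the reset rule described above. It is not the actual scheduler code:
the DagRun and TaskInstance classes, the reset_orphaned_tasks function, the
skip_external flag, and the "backfill_" run id prefix are all hypothetical
stand-ins chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Assumption: backfilled dagruns can be recognized by a run_id prefix.
BACKFILL_PREFIX = "backfill_"


@dataclass
class TaskInstance:
    """Hypothetical stand-in for a task instance row."""
    task_id: str
    state: Optional[str]


@dataclass
class DagRun:
    """Hypothetical stand-in for a dagrun row."""
    run_id: str
    external_trigger: bool
    state: str = "running"
    task_instances: List[TaskInstance] = field(default_factory=list)


def reset_orphaned_tasks(dag_runs: List[DagRun],
                         skip_external: bool = False) -> List[TaskInstance]:
    """Clear the state of queued tasks so they can be scheduled again.

    skip_external=True models the regressed behavior (externally triggered
    dagruns are ignored); skip_external=False models the behavior proposed
    in this issue. Backfilled dagruns are skipped in both cases.
    """
    reset = []
    for run in dag_runs:
        if run.state != "running":
            continue
        if run.run_id.startswith(BACKFILL_PREFIX):
            continue  # still not crash safe, left for another PR
        if skip_external and run.external_trigger:
            continue  # the special casing this change removes
        for ti in run.task_instances:
            if ti.state == "queued":
                ti.state = None  # no state: eligible for scheduling again
                reset.append(ti)
    return reset
```

Calling reset_orphaned_tasks(runs) once at scheduler startup, with
skip_external left at False, would then cover both scheduled and externally
triggered dagruns, while backfilled dagruns keep their queued tasks.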
was:
Orphaned task instances are only reset for dagruns that are neither externally
triggered nor backfilled. This violates the crash safety property of the
scheduler, i.e., if the scheduler crashes in the middle of an externally
triggered or backfilled dagrun, its tasks can be stuck in the "Queued" state
forever and never be executed.
I found the change that introduced this regression:
https://issues.apache.org/jira/browse/AIRFLOW-1059
This change reverts the special casing logic so that externally triggered
dagruns have orphaned tasks reset on startup of the scheduler.
> Ensure scheduler is crash safe for externally triggered dagruns
> ---------------------------------------------------------------
>
> Key: AIRFLOW-1884
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1884
> Project: Apache Airflow
> Issue Type: Bug
> Reporter: Grant Nicholas
> Assignee: Grant Nicholas
>
> Orphaned task instances are only reset for dagruns that are neither
> externally triggered nor backfilled. This violates the crash safety
> property of the scheduler, i.e., if the scheduler crashes in the middle of
> an externally triggered or backfilled dagrun, its tasks can be stuck in the
> "Queued" state forever and never be executed.
> I found the change that introduced this regression:
> https://issues.apache.org/jira/browse/AIRFLOW-1059
> This change reverts the special casing logic so that externally triggered
> dagruns have orphaned tasks reset on startup of the scheduler. Backfilled
> dagruns are still not crash safe; if that needs to be fixed, it will be
> done in another PR.