Gero Vermaas created AIRFLOW-4581:
-------------------------------------
Summary: After using a backfill for a DAG run, it will not run
anymore after a clear
Key: AIRFLOW-4581
URL: https://issues.apache.org/jira/browse/AIRFLOW-4581
Project: Apache Airflow
Issue Type: Bug
Components: scheduler
Affects Versions: 1.9.0
Reporter: Gero Vermaas
Hi,
After successfully running and completing a backfill for a specific DAG run,
the tasks for that DAG run are not automatically executed when doing a `clear`
for that DAG run. For DAG runs that have been triggered by their schedule, tasks
will automatically execute after a `clear` is done.
This is caused by the following query in the
`_find_executable_task_instances()` method (jobs.py, line 1056):
ti_query = (
    session
    .query(TI)
    .filter(TI.dag_id.in_(simple_dag_bag.dag_ids))
    .outerjoin(DR,
               and_(DR.dag_id == TI.dag_id,
                    DR.execution_date == TI.execution_date))
    .filter(or_(DR.run_id == None,
                not_(DR.run_id.like(BackfillJob.ID_PREFIX + '%'))))
    .outerjoin(DM, DM.dag_id == TI.dag_id)
    .filter(or_(DM.dag_id == None,
                not_(DM.is_paused)))
)
The filter `not_(DR.run_id.like(BackfillJob.ID_PREFIX + '%'))` excludes
tasks whose DAG run has a `run_id` that starts with `backfill_`, and after a
`clear` the `run_id` of the DAG run still holds `backfill_xxxxx` in the
`dag_run` table.
I would expect that doing a `clear` on a specific DAG run would also clear the
`run_id` in the `dag_run` table for that DAG run.
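To illustrate the effect of that filter, here is a minimal sketch using the standard-library `sqlite3` module rather than Airflow itself. The two-table schema, the helper `schedulable_task_ids()`, and all row values are simplified assumptions made up for this example; the `DM.is_paused` part of the query is left out. It only reproduces the outer join and the `run_id` condition described above, and then shows what happens when the `run_id` no longer starts with `backfill_`.

```python
import sqlite3

# Assumed prefix, matching the `backfill_` value described in this report.
BACKFILL_PREFIX = "backfill_"


def schedulable_task_ids(conn):
    """Hypothetical helper: task ids that pass the simplified run_id filter."""
    rows = conn.execute(
        """
        SELECT ti.task_id
        FROM task_instance AS ti
        LEFT OUTER JOIN dag_run AS dr
          ON dr.dag_id = ti.dag_id
         AND dr.execution_date = ti.execution_date
        WHERE dr.run_id IS NULL OR dr.run_id NOT LIKE ?
        ORDER BY ti.task_id
        """,
        (BACKFILL_PREFIX + "%",),
    ).fetchall()
    return [row[0] for row in rows]


# Simplified stand-ins for the real tables (only the columns the filter uses).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dag_run (dag_id TEXT, execution_date TEXT, run_id TEXT);
    CREATE TABLE task_instance (task_id TEXT, dag_id TEXT, execution_date TEXT);
    """
)
conn.executemany(
    "INSERT INTO dag_run VALUES (?, ?, ?)",
    [
        ("example", "2019-05-01", "scheduled__2019-05-01"),
        ("example", "2019-05-02", "backfill_2019-05-02"),
    ],
)
conn.executemany(
    "INSERT INTO task_instance VALUES (?, ?, ?)",
    [
        ("task_scheduled", "example", "2019-05-01"),
        ("task_backfilled", "example", "2019-05-02"),
    ],
)

# The task belonging to the backfilled DAG run is filtered out.
before = schedulable_task_ids(conn)

# Illustration only: give the backfilled run a run_id without the prefix.
conn.execute(
    "UPDATE dag_run SET run_id = ? WHERE run_id LIKE ?",
    ("manual__2019-05-02", BACKFILL_PREFIX + "%"),
)
after = schedulable_task_ids(conn)

print(before)
print(after)
```

In this sketch the backfilled task only shows up in the second result, once its DAG run's `run_id` no longer carries the prefix. Whether editing `run_id` directly in a real Airflow database is safe is not something this example establishes.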
Is there a reason for this behavior? Why are DAGs not automatically executed
after a `clear` is done in the situation in which that DAG run has been
triggered by a backfill in the past?
Regards,
Gero