[ https://issues.apache.org/jira/browse/AIRFLOW-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17103621#comment-17103621 ]
Kaxil Naik commented on AIRFLOW-1056:
-------------------------------------
PR to fix this: https://github.com/apache/airflow/pull/8776
Closing this issue as a duplicate of https://issues.apache.org/jira/browse/AIRFLOW-1156,
which describes the same problem.
> Single dag run triggered when un-pausing job with catchup=False
> ---------------------------------------------------------------
>
> Key: AIRFLOW-1056
> URL: https://issues.apache.org/jira/browse/AIRFLOW-1056
> Project: Apache Airflow
> Issue Type: Bug
> Components: scheduler
> Affects Versions: 1.8.0
> Reporter: Andrew Heuermann
> Priority: Major
>
> When catchup=False, a single DAG run is still triggered when un-pausing a
> DAG that has missed run windows.
> In airflow/jobs.py:create_dag_run(), when catchup is disabled, dag.start_date
> is updated here to prevent the backfill:
> https://github.com/apache/incubator-airflow/blob/bb39078a35cf2bceea58d7831d7a2028c8ef849f/airflow/jobs.py#L770.
> However, the function appears to schedule DAG runs based on a window (using
> sequential run times as lower and upper bounds), so it will always schedule a
> single DAG run if there is a missed run between the last run and the time
> at which the DAG was un-paused, even if it was un-paused AFTER those missed runs.
> Some ideas on solutions:
> * Pass in the time when the scheduler last ran and use that as the lower
> bound of the window, though I'm not sure how easy that is to obtain.
> * Update the start_date when a DAG with catchup=False is un-paused, or add a
> new "unpaused_date" field that would serve the same purpose.
> * While the DAG is paused, have the scheduler insert a skipped Job record
> whenever the job would have run.
> There might be a simpler solution I'm missing.
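
To make the window behaviour in the description concrete, here is a minimal
sketch in plain Python. It is NOT the code in airflow/jobs.py and not the
change made in the PR above; it is a simplified model of window-based
scheduling. The helper names (latest_due_slot, next_run) and the unpaused_at
parameter are hypothetical, the latter loosely following the "unpaused_date"
idea from the report.

```python
from datetime import datetime, timedelta
from typing import Optional


def latest_due_slot(anchor: datetime, interval: timedelta,
                    now: datetime) -> Optional[datetime]:
    """Start of the most recent interval that has fully elapsed by `now`."""
    completed = (now - anchor) // interval  # whole intervals since anchor
    if completed < 1:
        return None
    return anchor + (completed - 1) * interval


def next_run(last_run: Optional[datetime], anchor: datetime,
             interval: timedelta, now: datetime,
             unpaused_at: Optional[datetime] = None) -> Optional[datetime]:
    """Simplified catchup=False decision: consider only the latest due slot.

    `unpaused_at` is a hypothetical extra lower bound, not an Airflow field.
    """
    slot = latest_due_slot(anchor, interval, now)
    if slot is None or (last_run is not None and slot <= last_run):
        return None  # nothing new is due
    # Hypothetical guard: a slot becomes due at slot + interval. If that
    # moment passed before the DAG was un-paused, treat it as a missed run.
    if unpaused_at is not None and slot + interval < unpaused_at:
        return None
    return slot


# Daily schedule; the last run covered 2017-01-01, then the DAG was paused
# and un-paused at noon on 2017-01-05.
anchor = datetime(2017, 1, 1)
day = timedelta(days=1)
unpaused = datetime(2017, 1, 5, 12, 0)

without_guard = next_run(anchor, anchor, day, now=unpaused)
with_guard = next_run(anchor, anchor, day, now=unpaused, unpaused_at=unpaused)
```

In this model, without_guard is the 2017-01-04 slot (the single unexpected
run described above), while with_guard is None. The first slot to fire after
un-pausing would be 2017-01-05, once it completes at midnight on 2017-01-06.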