SamWheating opened a new issue #18023:
URL: https://github.com/apache/airflow/issues/18023


   ### Apache Airflow version
   
   2.1.3 (latest released)
   
   ### Operating System
   
   Debian GNU/Linux 10 (buster)
   
   ### Versions of Apache Airflow Providers
   
   _No response_
   
   ### Deployment
   
   Other 3rd-party Helm chart
   
   ### Deployment details
   
   _No response_
   
   ### What happened
   
   When a DAG is created with `catchup=True`, a start date far in the past and a 
relatively small schedule interval, the scheduler creates a queued DagRun for 
every missed interval, which can easily mean thousands of queued DagRuns. 
   
   This can increase the time it takes to move a DagRun from `queued` to 
`running`, because the `_start_queued_dagruns` function has to iterate through 
the entire set of `queued` DagRuns (looking at 
`max_dagruns_per_loop_to_schedule` at a time) before it re-examines a given 
DagRun. 
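
As a rough illustration of the delay described above, here is a simplified round-robin model. It is not the actual scheduler code: the function name `loops_until_reexamined` is hypothetical, and the batch size of 20 is an assumed value for `max_dagruns_per_loop_to_schedule`.

```python
def loops_until_reexamined(queued_runs: int, batch_size: int) -> int:
    """Scheduler loops needed to cycle once through every queued DagRun.

    Simplified model: each loop examines `batch_size` queued DagRuns and
    only returns to a given DagRun after all the others have been examined.
    """
    if queued_runs < 0 or batch_size <= 0:
        raise ValueError("queued_runs must be >= 0 and batch_size must be > 0")
    # Integer ceiling division, so a partial final batch counts as one loop.
    return (queued_runs + batch_size - 1) // batch_size


# Assumed values: ~2000 queued DagRuns, 20 examined per scheduler loop.
print(loops_until_reexamined(2000, 20))
```

Under these assumptions a given DagRun is revisited only once every 100 scheduler loops, and that number grows linearly with the size of the queue.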
   
   It appears that this is intended behaviour, but I'm wondering if we should 
be limiting the number of queued DagRuns per DAG in order to reduce strain on 
the scheduler. 
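
A minimal sketch of what such a per-DAG limit could look like. The function `runs_to_create` and the `max_queued` parameter are hypothetical names used only for illustration; this is not an existing scheduler API.

```python
def runs_to_create(pending_intervals: int, currently_queued: int, max_queued: int) -> int:
    """Number of new DagRuns to create for one DAG under a queue cap.

    pending_intervals: missed schedule intervals that do not have a DagRun yet
    currently_queued:  DagRuns of this DAG already in the `queued` state
    max_queued:        hypothetical per-DAG cap on queued DagRuns
    """
    free_slots = max(0, max_queued - currently_queued)
    return min(pending_intervals, free_slots)


# With a cap of 16, a backlog of 2000 intervals is queued 16 runs at a time.
print(runs_to_create(pending_intervals=2000, currently_queued=0, max_queued=16))
print(runs_to_create(pending_intervals=2000, currently_queued=16, max_queued=16))
```

With a cap like this the remaining intervals would be picked up on later scheduler loops as queued runs finish, so the set `_start_queued_dagruns` has to scan stays bounded per DAG.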
   
   ### What you expected to happen
   
   _No response_
   
   ### How to reproduce
   
   This DAG will create ~2000 queued DagRuns when it is loaded:
   
   ```python
   import time
   from datetime import timedelta
   
   from airflow import DAG
   from airflow.operators.python import PythonOperator
   from airflow.utils import timezone
   
   # catchup=True makes the scheduler create a DagRun for every interval since
   # start_date, while max_active_runs=1 lets only one of them run at a time.
   dag = DAG(
       'data-infrastructure-examples.backfill-example-8',
       start_date=timezone.utcnow() - timedelta(days=1),
       max_active_runs=1,
       dagrun_timeout=timedelta(minutes=10),
       schedule_interval="*/5 * * * *",
       catchup=True,
       concurrency=1,
   )
   
   def sleep_five_minutes():
       """Keep the single active run busy for a full schedule interval."""
       print("sleeping 5m")
       time.sleep(300)
       print("Done")
   
   child_task = PythonOperator(
       task_id='sleep_5_minutes',
       dag=dag,
       python_callable=sleep_five_minutes,
   )
   ```
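
For a rough sense of how many DagRuns a catchup window produces, the count is approximately the window length divided by the schedule interval. The helper `catchup_run_count` below is a hypothetical name, and the exact number the scheduler creates may differ by one depending on how the start date aligns with the cron schedule.

```python
from datetime import timedelta


def catchup_run_count(window: timedelta, interval: timedelta) -> int:
    """Approximate number of schedule intervals that fit in the catchup window."""
    if interval <= timedelta(0):
        raise ValueError("interval must be positive")
    # Floor division of two timedeltas yields an int.
    return window // interval


five_minutes = timedelta(minutes=5)
print(catchup_run_count(timedelta(days=1), five_minutes))  # one-day window
print(catchup_run_count(timedelta(days=7), five_minutes))  # one-week window
```

By this estimate a one-day window at a five-minute interval gives about 288 runs, while a one-week window gives about 2016, so a figure near 2000 corresponds to a start date roughly a week in the past.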
   
   ### Anything else
   
   _No response_
   
   ### Are you willing to submit PR?
   
   - [ ] Yes I am willing to submit a PR!
   
   ### Code of Conduct
   
   - [X] I agree to follow this project's [Code of Conduct](https://github.com/apache/airflow/blob/main/CODE_OF_CONDUCT.md)
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.


