SamWheating commented on issue #18023:
URL: https://github.com/apache/airflow/issues/18023#issuecomment-914490044


   After further investigation, I think that most of the performance issues we 
saw were due to the issue now resolved by 
https://github.com/apache/airflow/pull/17945.
   
   If needed, I can run some experiments and test the effects of having 
thousands of `queued` DagRuns on scheduler latency, but I think it's not as 
severe as I once thought. 
   
   However, I still think that it's a good idea to implement a limit for queued 
DagRuns, as creating thousands of queued runs in advance (specifically in the 
case of DAGs with a much earlier start date and `catchup=True`) can lead to 
some weird behaviour:
   
   1) If the `end_date` is changed to an earlier date, queued runs may already 
exist for dates after the new `end_date`.
   2) If the schedule interval is changed, the change will not affect queued 
DagRuns that have already been created. 
   3) If there's a lifecycle policy on data in the DB, creating a DagRun 
potentially weeks before it actually runs could cause the queued run to be 
dropped before it is ever run. 
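   
   To illustrate the first point, here is a minimal, self-contained sketch. It 
is not Airflow's scheduler code: `schedule_dates`, `create_queued_runs` and 
`max_queued` are hypothetical names, and the dates and the limit of 16 are 
made up for the example. It only models runs as a list of dates and counts how 
many would fall after an `end_date` that is moved earlier.

```python
from datetime import date, timedelta


def schedule_dates(start, end, interval_days):
    """Yield every logical date from start to end (inclusive)."""
    current = start
    while current <= end:
        yield current
        current += timedelta(days=interval_days)


def create_queued_runs(start, end, interval_days, max_queued=None):
    """Pre-create queued runs, stopping early if max_queued is set.

    Hypothetical model only: a "run" here is just its logical date.
    """
    runs = []
    for logical_date in schedule_dates(start, end, interval_days):
        if max_queued is not None and len(runs) >= max_queued:
            break
        runs.append(logical_date)
    return runs


start = date(2021, 1, 1)
original_end = date(2021, 12, 31)
new_end = date(2021, 6, 30)  # end_date moved earlier after runs were created

# Without a limit, every run for the year is queued up front.
uncapped = create_queued_runs(start, original_end, interval_days=1)
# With an assumed limit of 16, only the next 16 runs are queued.
capped = create_queued_runs(start, original_end, interval_days=1, max_queued=16)

# Runs that already exist but now fall after the new end_date.
stale_uncapped = [d for d in uncapped if d > new_end]
stale_capped = [d for d in capped if d > new_end]

print(len(uncapped), len(stale_uncapped))
print(len(capped), len(stale_capped))
```

   In this toy model the uncapped case leaves every run after the new 
`end_date` behind, while the capped case has not created any of them yet. The 
same reasoning applies to a changed schedule interval: the fewer runs that are 
created ahead of time, the fewer can be out of date.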
   
   



