The GitHub Actions job "Tests" on airflow.git/fix/scheduler-performance-with-completed-dagruns has failed. Run started by GitHub user Arunodoy18 (triggered by Arunodoy18).
Head commit for run:
74678c06b5bde43010662fb2d94c11f8b2ee2521 / Arunodoy18 <[email protected]>
Fix scheduler slowdown with large numbers of completed dag runs

The scheduler was experiencing significant performance degradation when
there were many completed dag runs (100k+) and task instances (3M+).
Each scheduler loop was taking 15+ seconds instead of the normal ~1s.

Root cause:
- DagRun.get_running_dag_runs_to_examine() was eagerly loading ALL task
  instances for ALL running dag runs using joinedload()
- This created massive joins with millions of rows even though only
  unfinished task instances were actually needed
- The eager loading was only used in one code path
  (_verify_integrity_if_dag_changed) and only for unfinished TIs

Solution:
1. Remove the joinedload(cls.task_instances) from the query to avoid
   loading task instances upfront
2. Explicitly query only unfinished task instances when they're needed
   in _verify_integrity_if_dag_changed

This change significantly improves scheduler loop performance when there
are many completed dag runs and task instances, bringing the loop time
back to normal levels (~1s) without requiring frequent db clean operations.

Fixes #54283

Report URL: https://github.com/apache/airflow/actions/runs/20724141289

With regards,
GitHub Actions via GitBox

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
