The GitHub Actions job "Tests" on airflow.git/fix/dag-runs-orm-query-performance has failed.
Run started by GitHub user pierrejeambrun (triggered by pierrejeambrun).

Head commit for run:
0c2fd9cacb840a75674c2d9b6664074a1d4e7221 / LakshmiSravyaVedantham 
<[email protected]>
perf: use load_only() in eager_load_dag_run_for_validation to reduce data fetched

The get_dag_runs API endpoint was slow on large deployments because
eager_load_dag_run_for_validation() used selectinload on task_instances and
task_instances_histories without restricting which columns were fetched.
This caused SQLAlchemy to load all heavyweight columns (executor_config with
pickled data, hostname, rendered fields, etc.) for every task instance across
every DAG run in the result page — even though only dag_version_id is needed
to traverse the association proxy to DagVersion.

Add load_only(TaskInstance.dag_version_id) and
load_only(TaskInstanceHistory.dag_version_id) to the selectinload chains so
the SELECT for task instances fetches only the identity columns and the FK
needed to resolve the dag_version relationship, significantly reducing the
volume of data transferred from the database on busy deployments.

Fixes #62025

Report URL: https://github.com/apache/airflow/actions/runs/22728316605

With regards,
GitHub Actions via GitBox


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]