LakshmiSravyaVedantham opened a new pull request, #62482:
URL: https://github.com/apache/airflow/pull/62482
## Summary
Closes #62025
The `get_dag_runs` REST API endpoint experiences slow response times on
large Airflow deployments.
## Root Cause
`eager_load_dag_run_for_validation()` used `selectinload` on
`DagRun.task_instances` and `DagRun.task_instances_histories` without
restricting which columns were fetched. This caused SQLAlchemy to load **all**
heavyweight `TaskInstance` columns — including `executor_config` (a pickled
binary blob), `hostname`, `rendered_map_index`, `context_carrier`, and many
others — for every task instance across every DAG run in the result page.
In practice, only `dag_version_id` (the foreign key) is needed to traverse
the `TaskInstance.dag_version` association proxy used by the
`DagRun.dag_versions` property, which drives `DAGRunResponse.dag_versions`
serialization.
## Fix
Add `load_only(TaskInstance.dag_version_id)` and
`load_only(TaskInstanceHistory.dag_version_id)` to the respective
`selectinload` chains. SQLAlchemy will then issue a `SELECT` for task instances
that fetches only the identity columns and `dag_version_id`, rather than the
full row. The subsequent
`.joinedload(TaskInstance.dag_version).joinedload(DagVersion.bundle)` chain
continues to work correctly.
```python
# Before
selectinload(DagRun.task_instances)
.joinedload(TaskInstance.dag_version)
.joinedload(DagVersion.bundle),
# After
selectinload(DagRun.task_instances)
.load_only(TaskInstance.dag_version_id)
.joinedload(TaskInstance.dag_version)
.joinedload(DagVersion.bundle),
```
The same pattern is applied to `task_instances_histories`.
## Impact
On deployments with many task instances per DAG run, this change
significantly reduces:
- The volume of data transferred from the database per request
- Memory pressure from deserializing large pickled `executor_config` values
- Response latency for `GET /dags/{dag_id}/dagRuns` and the batch `POST
/dags/~/dagRuns/list` endpoints
## Testing
- Existing tests in
`tests/unit/api_fastapi/core_api/routes/public/test_dag_run.py` continue to pass
- The `dag_versions` property continues to work correctly since
`load_only()` restricts deferred columns, not the relationship traversal
## References
- [SQLAlchemy `load_only`
docs](https://docs.sqlalchemy.org/en/20/orm/queryguide/columns.html#column-deferral-using-load-only)
- [SQLAlchemy loading
strategies](https://docs.sqlalchemy.org/en/20/orm/queryguide/relationships.html)
- Fixes: #62025
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]