Abhiii47 opened a new pull request, #62176: URL: https://github.com/apache/airflow/pull/62176
### Description

This PR optimizes `DagRun.get_task_instances` by removing a redundant `joinedload(TI.dag_run)` from its database query.

**The Problem:**
Currently, when task instances are fetched through a `DagRun` instance, the query defaults to a `joinedload` on the `dag_run` relationship. Since `get_task_instances` is an instance method, the `DagRun` object (`self`) is already in memory. The join therefore fetches data we already hold, once for every task instance row returned, causing unnecessary database load and memory bloat.

**The Solution:**
- **Refactoring:** Extracted the core logic of `fetch_task_instances` into a new internal method `_fetch_task_instances` located in `airflow/models/dagrun.py`.
- **Optimization Toggle:** Added a `load_dag_run: bool` parameter. When `False`, it skips the `joinedload` option in the SQLAlchemy query.
- **Manual Backfilling:** In the `get_task_instances` method, we now call the internal fetch with `load_dag_run=False` and manually set `ti.dag_run = self` for each returned task instance. This preserves object identity and prevents future lazy-loading without requiring a join.

**Changes:**
- Modified `airflow/models/dagrun.py`: Refactored `fetch_task_instances` and optimized `get_task_instances`.
- Added `tests/unit/models/test_dagrun_optimization.py`: A new suite of mock-based tests to verify that the SQL query excludes the join when the optimization is active.

### Related Issue

closes: #62027

---

##### Was generative AI tooling used to co-author this PR?

- [x] Yes (Gemini)

Generated-by: Gemini following [the guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)
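
For illustration only, the pattern described under **The Solution** (skip the join, then attach the parent object that is already in memory) can be sketched with the standard-library `sqlite3` module. This is not the code from the PR and does not use Airflow or SQLAlchemy: the table layout, the `build_query`, `fetch_task_instances`, `get_task_instances`, and `demo` helpers, and the simplified `DagRun`/`TaskInstance` dataclasses are all assumptions made for the example.

```python
import sqlite3
from dataclasses import dataclass
from typing import List, Optional


# Hypothetical, simplified stand-ins for the Airflow models.
@dataclass
class DagRun:
    id: int
    run_id: str


@dataclass
class TaskInstance:
    task_id: str
    dag_run: Optional[DagRun] = None


def build_query(load_dag_run: bool) -> str:
    """Return the SELECT statement, with or without the join to dag_run."""
    if load_dag_run:
        # Every returned row repeats the dag_run columns.
        return (
            "SELECT ti.task_id, dr.id, dr.run_id "
            "FROM task_instance AS ti "
            "JOIN dag_run AS dr ON dr.id = ti.dag_run_id "
            "WHERE ti.dag_run_id = ? ORDER BY ti.task_id"
        )
    # No join: only task instance columns are fetched.
    return (
        "SELECT task_id FROM task_instance "
        "WHERE dag_run_id = ? ORDER BY task_id"
    )


def fetch_task_instances(
    conn: sqlite3.Connection, dag_run_id: int, load_dag_run: bool = True
) -> List[TaskInstance]:
    rows = conn.execute(build_query(load_dag_run), (dag_run_id,)).fetchall()
    if load_dag_run:
        # This sketch builds one DagRun per row from the joined columns.
        return [TaskInstance(t, DagRun(i, r)) for t, i, r in rows]
    return [TaskInstance(t) for (t,) in rows]


def get_task_instances(conn: sqlite3.Connection, dag_run: DagRun) -> List[TaskInstance]:
    """Fetch without the join, then backfill the parent we already hold."""
    tis = fetch_task_instances(conn, dag_run.id, load_dag_run=False)
    for ti in tis:
        ti.dag_run = dag_run  # same object for every task instance
    return tis


def demo():
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE dag_run (id INTEGER PRIMARY KEY, run_id TEXT)")
    conn.execute("CREATE TABLE task_instance (task_id TEXT, dag_run_id INTEGER)")
    conn.execute("INSERT INTO dag_run VALUES (1, 'manual__1')")
    conn.executemany(
        "INSERT INTO task_instance VALUES (?, ?)",
        [("extract", 1), ("load", 1)],
    )
    run = DagRun(1, "manual__1")
    return run, get_task_instances(conn, run)
```

Calling `demo()` returns the in-memory run together with its task instances, each of which points back at that same `DagRun` object even though the query never touched the `dag_run` table.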
