Abhiii47 opened a new pull request, #62176:
URL: https://github.com/apache/airflow/pull/62176

   ### Description
   
   This PR optimizes `DagRun.get_task_instances` by removing a redundant 
`joinedload(TI.dag_run)` join from its database query.
   
   **The Problem:**
   Currently, when fetching task instances through a `DagRun` instance, the 
query defaults to a `joinedload` on the `dag_run` relationship. Since 
`get_task_instances` is an instance method, we already have the `DagRun` object 
(`self`) in memory. This join results in fetching the same data we already 
possess for every single task instance row returned, causing unnecessary 
database load and memory bloat.
   
   **The Solution:**
   - **Refactoring:** Extracted the core logic of `fetch_task_instances` into a 
new internal method `_fetch_task_instances` located in 
`airflow/models/dagrun.py`.
   - **Optimization Toggle:** Added a `load_dag_run: bool` parameter. When 
`False`, it skips the `joinedload` option in the SQLAlchemy query.
   - **Manual Backfilling:** In the `get_task_instances` method, we now call 
the internal fetch with `load_dag_run=False` and manually set `ti.dag_run = 
self` for each returned task instance. This preserves object identity and 
prevents future lazy-loading without requiring a join.
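
The three steps above can be sketched roughly as follows. This is a minimal illustration that uses the standard-library `sqlite3` module and plain dataclasses instead of Airflow's SQLAlchemy models, so it is not the code from this PR. The table layout and the helpers `make_demo_db` and `build_query` are hypothetical, and the `fetch_task_instances` shown here is a stand-in for the internal fetch method described above.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class TaskInstance:
    task_id: str
    run_id: str
    dag_run: Optional["DagRun"] = None  # parent reference, filled in after the fetch


@dataclass
class DagRun:
    run_id: str
    state: str

    def get_task_instances(self, conn):
        # The parent row is already in memory as `self`, so skip the join.
        tis = fetch_task_instances(conn, self.run_id, load_dag_run=False)
        for ti in tis:
            ti.dag_run = self  # backfill with the existing object
        return tis


def build_query(load_dag_run: bool) -> str:
    """Hypothetical helper: include the join only when the parent is requested."""
    if load_dag_run:
        return (
            "SELECT ti.task_id, ti.run_id, dr.state "
            "FROM task_instance ti JOIN dag_run dr ON dr.run_id = ti.run_id "
            "WHERE ti.run_id = ? ORDER BY ti.task_id"
        )
    return "SELECT task_id, run_id FROM task_instance WHERE run_id = ? ORDER BY task_id"


def fetch_task_instances(conn, run_id, load_dag_run=True):
    """Stand-in for the internal fetch; the toggle controls the join."""
    rows = conn.execute(build_query(load_dag_run), (run_id,)).fetchall()
    tis = []
    for row in rows:
        ti = TaskInstance(task_id=row[0], run_id=row[1])
        if load_dag_run:
            # Plain sqlite3 has no identity map, so each row builds its own copy.
            ti.dag_run = DagRun(run_id=row[1], state=row[2])
        tis.append(ti)
    return tis


def make_demo_db():
    """Hypothetical in-memory database with one run and two task instances."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dag_run (run_id TEXT PRIMARY KEY, state TEXT);
        CREATE TABLE task_instance (task_id TEXT, run_id TEXT);
        INSERT INTO dag_run VALUES ('run_1', 'running');
        INSERT INTO task_instance VALUES ('extract', 'run_1'), ('load', 'run_1');
        """
    )
    return conn
```

Calling `DagRun("run_1", "running").get_task_instances(make_demo_db())` returns task instances whose `dag_run` attribute is the calling object itself, and the SQL that ran contains no `JOIN`.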
   
   
   
   **Changes:**
   - Modified `airflow/models/dagrun.py`: Refactored `fetch_task_instances` and 
optimized `get_task_instances`.
   - Added `tests/unit/models/test_dagrun_optimization.py`: A new suite of 
mock-based tests to verify that the SQL query excludes the join when the 
optimization is active.
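
A mock-based check of this kind might look like the sketch below. It is not taken from `test_dagrun_optimization.py`: the function `fetch_with_toggle` is a hypothetical stand-in for the internal fetch method, and the string arguments are placeholders for the real model and loader option. The point is only that a `MagicMock` session lets a test assert whether `.options(...)` was called, without touching a database.

```python
from unittest.mock import MagicMock


def fetch_with_toggle(session, run_id, load_dag_run=True):
    """Hypothetical stand-in: apply the loader option only when requested."""
    query = session.query("TaskInstance").filter_by(run_id=run_id)
    if load_dag_run:
        # Placeholder for the real joinedload(TI.dag_run) option.
        query = query.options("joinedload(TI.dag_run)")
    return query.all()


def test_join_skipped_when_disabled():
    session = MagicMock()
    fetch_with_toggle(session, "run_1", load_dag_run=False)
    query = session.query.return_value.filter_by.return_value
    query.options.assert_not_called()
    query.all.assert_called_once()


def test_join_applied_by_default():
    session = MagicMock()
    fetch_with_toggle(session, "run_1")
    query = session.query.return_value.filter_by.return_value
    query.options.assert_called_once_with("joinedload(TI.dag_run)")
    query.options.return_value.all.assert_called_once()
```

Both functions follow the pytest naming convention, so a test runner would collect them as written.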
   
   ### Related Issue
   closes: #62027
   
   ---
   
   ##### Was generative AI tooling used to co-author this PR?
   
   - [x] Yes (Gemini)
   
   Generated-by: Gemini following [the 
guidelines](https://github.com/apache/airflow/blob/main/contributing-docs/05_pull_requests.rst#gen-ai-assisted-contributions)

