The GitHub Actions job "Tests AMD" on airflow.git/fix-detached-bug has failed.
Run started by GitHub user kaxil (triggered by kaxil).

Head commit for run:
9681561d965af375a1bc9a0d67eacbcc871d0f03 / Kaxil Naik <[email protected]>
Fix scheduler heartbeat timeout failures with ``DetachedInstanceError``

Resolves `DetachedInstanceError` when the scheduler processes task instances
that have timed out during heartbeat detection. The error occurred when
Pydantic validation of `TIRunContext` attempted to access the
`consumed_asset_events` relationship on `DagRun` objects that had been
detached from the `SQLAlchemy` session.

Root cause: The main scheduler loop calls `session.expunge_all()` which detaches
all objects from the session. Later, when processing heartbeat timeouts, the
scheduler creates `TIRunContext` objects that trigger Pydantic validation of
`dag_run.consumed_asset_events`, causing `DetachedInstanceError` on the
lazy-loaded relationship.

Solution: Add `selectinload(DagRun.consumed_asset_events)` to the heartbeat
timeout query to eagerly load the relationship before objects are detached.
This minimal fix loads only the required relationship without over-eager
loading of nested fields that aren't accessed during heartbeat processing.
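
The failure mode and the fix can be reproduced outside Airflow. The sketch
below is not the scheduler code: `Run` and `Event` are hypothetical stand-ins
for `DagRun` and its `consumed_asset_events` relationship, and an in-memory
SQLite database is assumed. It loads a row, detaches it with
`session.expunge_all()`, and then touches the relationship, once with the
default lazy loading and once with `selectinload`.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, select
from sqlalchemy.orm import Session, declarative_base, relationship, selectinload
from sqlalchemy.orm.exc import DetachedInstanceError

Base = declarative_base()


class Run(Base):  # hypothetical stand-in for DagRun
    __tablename__ = "run"
    id = Column(Integer, primary_key=True)
    # Default lazy="select": loaded on first access, which needs a live session.
    events = relationship("Event")


class Event(Base):  # hypothetical stand-in for the consumed asset events
    __tablename__ = "event"
    id = Column(Integer, primary_key=True)
    run_id = Column(Integer, ForeignKey("run.id"))


engine = create_engine("sqlite://")  # in-memory database, assumed for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Run(id=1))  # a run with no events, like a non-asset DAG run
    session.commit()


def load_then_detach(stmt):
    """Load one Run, detach it, then access the relationship."""
    with Session(engine) as session:
        run = session.execute(stmt).scalars().one()
        session.expunge_all()  # same call the commit message points at
    try:
        return list(run.events)
    except DetachedInstanceError:
        return "DetachedInstanceError"


lazy_result = load_then_detach(select(Run))
eager_result = load_then_detach(select(Run).options(selectinload(Run.events)))

print(lazy_result)
print(eager_result)
```

In this sketch the plain query hits the error because the lazy load has no
session to run in, while the `selectinload` variant has already populated the
(empty) collection before the object is detached.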

The fix affects all DAG types since `consumed_asset_events` is initialized as an
empty list on all `DagRun` objects, not just asset-triggered DAGs.

Longer term, using `back_populates` (with `lazy="selectin"`) might be better so
we don't need to remember this:
https://docs.sqlalchemy.org/en/20/orm/queryguide/relationships.html
https://docs.sqlalchemy.org/en/20/orm/relationship_api.html#sqlalchemy.orm.relationship.params.back_populates
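
As a rough illustration of that longer-term option, the loader strategy can be
set on the relationship itself so every query picks it up. This is a sketch
under assumptions, not a proposed Airflow change: `Run` and `Event` are
hypothetical models, and `back_populates` is left out because only
`lazy="selectin"` affects loading here.

```python
from sqlalchemy import Column, ForeignKey, Integer, create_engine, select
from sqlalchemy.orm import Session, declarative_base, relationship

Base = declarative_base()


class Run(Base):  # hypothetical stand-in for DagRun
    __tablename__ = "run"
    id = Column(Integer, primary_key=True)
    # Loader strategy declared once on the model instead of per query.
    events = relationship("Event", lazy="selectin")


class Event(Base):  # hypothetical stand-in for the consumed asset events
    __tablename__ = "event"
    id = Column(Integer, primary_key=True)
    run_id = Column(Integer, ForeignKey("run.id"))


engine = create_engine("sqlite://")  # in-memory database, assumed for the demo
Base.metadata.create_all(engine)

with Session(engine) as session:
    session.add(Run(id=1))
    session.commit()

with Session(engine) as session:
    # No .options(...) needed: the relationship is loaded with the parent row.
    run = session.execute(select(Run)).scalars().one()
    session.expunge_all()

events_after_detach = list(run.events)  # already loaded, so no session needed
print(events_after_detach)
```

The trade-off is that every query for the parent pays for the extra SELECT,
including callers that never read the relationship.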

Report URL: https://github.com/apache/airflow/actions/runs/16582527581

With regards,
GitHub Actions via GitBox


