o-nikolas commented on code in PR #67217:
URL: https://github.com/apache/airflow/pull/67217#discussion_r3300275524
##########
airflow-core/src/airflow/executors/workloads/task.py:
##########
@@ -118,9 +118,13 @@ def make(
ser_ti = TaskInstanceDTO.model_validate(ti, from_attributes=True)
if not bundle_info:
+ version_data = None
+ if ti.dag_version is not None:
+ version_data = ti.dag_version.version_data
Review Comment:
@ashb - What we're passing along here is the version that is being used and
the bundle to use to grab it. The bundle itself isn't tracking the current
version of the dag, that's on the dag_version. So once on the worker side,
we'll ask the bundle to load the state of the bundle for the specific version
the dag is trying to run with. So properties from these two things are combined
later. Does that make sense?
@kaxil Fair point, I think the right rule is: version_data should only be
populated when the run is pinned (i.e., dag_run.bundle_version is not None). If
the run is unpinned, the worker should use the latest bundle state anyway, so
sending stale version_data would be misleading. I'll add a guard:
```python
version_data = None
if ti.dag_version is not None and ti.dag_run.bundle_version is not None:
version_data = ti.dag_version.version_data
```
This mirrors the existing rule for bundle_version at
scheduler_job_runner.py:1438-1442.
For the second case the `version_data` will still come from the ti's
`dag_version` (which may be newer than the run's), but since the run is pinned,
the bundle will use `version_data` to check out the right code regardless. If
we want stricter alignment (version_data always matches bundle_version), we'd
need to look up the `DagVersion` by `bundle_version` string rather than using
`ti.dag_version` but that adds another query and I think the current ti-based
approach is correct for the worker's needs. Thoughts?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]