o-nikolas commented on code in PR #67217:
URL: https://github.com/apache/airflow/pull/67217#discussion_r3300275524


##########
airflow-core/src/airflow/executors/workloads/task.py:
##########
@@ -118,9 +118,13 @@ def make(
 
         ser_ti = TaskInstanceDTO.model_validate(ti, from_attributes=True)
         if not bundle_info:
+            version_data = None
+            if ti.dag_version is not None:
+                version_data = ti.dag_version.version_data

Review Comment:
   @ashb - What we're passing along here is the version that is being used and 
the bundle to use to grab it. The bundle itself isn't tracking the current 
version of the dag, that's on the dag_version. So once on the worker side, 
we'll ask the bundle to load the state of the bundle for the specific version 
the dag is trying to run with. So properties from these two things are combined 
later. Does that make sense?
   
   @kaxil Fair point, I think the right rule is: version_data should only be 
populated when the run is pinned (i.e., dag_run.bundle_version is not None). If 
the run is unpinned, the worker should use the latest bundle state anyway, so 
sending stale version_data would be misleading. I'll add a guard:
   
   ```python  
   version_data = None
   if ti.dag_version is not None and ti.dag_run.bundle_version is not None:
       version_data = ti.dag_version.version_data 
   ```
   This mirrors the existing rule for bundle_version at 
scheduler_job_runner.py:1438-1442.
   
   For the second case the `version_data` will still come from the ti's 
`dag_version` (which may be newer than the run's), but since the run is pinned, 
the bundle will use `version_data` to check out the right code regardless. If 
we want stricter alignment (version_data always matches bundle_version), we'd 
need to look up the `DagVersion` by `bundle_version` string rather than using 
`ti.dag_version` but that adds another query and I think the current ti-based 
approach is correct for the worker's needs. Thoughts?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to