potiuk commented on code in PR #66491:
URL: https://github.com/apache/airflow/pull/66491#discussion_r3215225763


##########
airflow-core/src/airflow/cli/commands/dag_command.py:
##########
@@ -740,4 +741,6 @@ def dag_reserialize(args, session: Session = NEW_SESSION) 
-> None:
             continue
         bundle.initialize()
         dag_bag = BundleDagBag(bundle.path, bundle_path=bundle.path, 
bundle_name=bundle.name)
-        sync_bag_to_db(dag_bag, bundle.name, 
bundle_version=bundle.get_current_version(), session=session)
+        result = bundle.get_current_version()
+        version = result.version if isinstance(result, BundleVersion) else 
result
+        sync_bag_to_db(dag_bag, bundle.name, bundle_version=version, 
session=session)

Review Comment:
   *Flagged by the adversarial reviewer (Codex). Cross-checked — agree with the 
finding; this is a sharper version of my earlier finding #1.*
   
   **[medium] `dags reserialize` drops `BundleVersion.data`**
   
   > `dag_reserialize` unwraps `BundleVersion` to only `result.version` before 
calling `sync_bag_to_db`, and `sync_bag_to_db` still has no `version_data` 
parameter. Any bundle that depends on structured version metadata, such as a 
manifest, loses that metadata when Dags are reserialized through the CLI. The 
persisted `DagVersion` will have the right `bundle_version` but 
`version_data=NULL`, which can break later consumers that need the manifest to 
reconstruct the exact bundle contents. **This is a silent data loss path, not 
just a display issue.**
   >
   > **Recommendation:** Propagate both pieces from `BundleVersion`: add 
`version_data` to `sync_bag_to_db`, pass it through to 
`update_dag_parsing_results_in_db`, and add a CLI reserialize test asserting 
`DagVersion.version_data` is stored.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to