kaxil commented on code in PR #64626:
URL: https://github.com/apache/airflow/pull/64626#discussion_r3028665314
##########
airflow-core/src/airflow/serialization/encoders.py:
##########
@@ -162,6 +162,14 @@ def _ensure_serialized(d):
if isinstance(trigger, dict):
classpath = trigger["classpath"]
kwargs = trigger["kwargs"]
+ # unwrap any kwargs that are themselves serialized objects, to avoid double-serialization in the trigger's own serialize() method.
+ unwrapped = {}
+ for k, v in kwargs.items():
+ if isinstance(v, dict) and Encoding.TYPE in v:
+ unwrapped[k] = BaseSerialization.deserialize(v)
Review Comment:
This unwrap step is a no-op. After unwrapping, `_ensure_serialized` on line
169 re-serializes each value right back to the same form. Walk through it with
a tuple kwarg `(1, 2)` stored as `{"__type": "tuple", "__var": [1, 2]}`: before
the PR, `_ensure_serialized` sees `Encoding.TYPE` in the dict and returns it
as-is; after the PR, unwrap deserializes to `(1, 2)`, then `_ensure_serialized`
calls `BaseSerialization.serialize((1,2))` producing `{"__type": "tuple",
"__var": [1, 2]}`. Same output either way.
The root cause of the hash mismatch in `add_asset_trigger_references` is
that `BaseEventTrigger.hash()` internally calls
`BaseSerialization.serialize(kwargs)` again on kwargs that are already
serialized by `encode_trigger`. The DB side doesn't double-wrap because
`_decrypt_kwargs` returns raw Python objects. The fix needs to happen where the
hash is computed (as PR #64625 does), not inside `encode_trigger`.
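To illustrate the mechanism (again with a simplified stand-in serializer, not Airflow's real `BaseSerialization` or hash code): serializing an already-serialized payload adds an extra envelope, so the bytes fed into the hash differ between the two paths.

```python
import json


def serialize(v):
    # Simplified stand-in for BaseSerialization.serialize: wrap dicts in a
    # typed envelope; primitives pass through.
    if isinstance(v, dict):
        return {"__type": "dict", "__var": {k: serialize(x) for k, x in v.items()}}
    return v


kwargs = {"poke_interval": 5}

# DB path: _decrypt_kwargs returns raw Python objects, so the hash is
# computed over kwargs serialized exactly once.
once = json.dumps(serialize(kwargs), sort_keys=True)

# DAG-parsed path: encode_trigger already serialized the kwargs, and the
# hash helper serializes them again -> an extra envelope.
twice = json.dumps(serialize(serialize(kwargs)), sort_keys=True)

assert once != twice  # the extra envelope changes the hashed payload
```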
##########
airflow-core/src/airflow/serialization/encoders.py:
##########
@@ -162,6 +162,14 @@ def _ensure_serialized(d):
if isinstance(trigger, dict):
classpath = trigger["classpath"]
kwargs = trigger["kwargs"]
+ # unwrap any kwargs that are themselves serialized objects, to avoid double-serialization in the trigger's own serialize() method.
+ unwrapped = {}
+ for k, v in kwargs.items():
+ if isinstance(v, dict) and Encoding.TYPE in v:
+ unwrapped[k] = BaseSerialization.deserialize(v)
+ else:
+ unwrapped[k] = v
+ kwargs = unwrapped
else:
classpath, kwargs = trigger.serialize()
return {
Review Comment:
No tests for this change. A serialization fix to the asset watcher trigger
persistence path should include a test that verifies hash consistency between
the DAG-parsed and DB-stored paths -- something like: create a trigger with
non-primitive kwargs (tuple, set), round-trip through
`encrypt_kwargs`/`_decrypt_kwargs`, and verify `BaseEventTrigger.hash` produces
the same value for both the original and round-tripped kwargs.
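A rough sketch of the shape such a test could take. Everything here is a stand-in: `serialize`/`deserialize` substitute for the `encrypt_kwargs`/`_decrypt_kwargs` round trip, `trigger_hash` for `BaseEventTrigger.hash`, and `pkg.MyTrigger` is a made-up classpath; only a tuple kwarg is shown for brevity:

```python
import hashlib
import json


def serialize(v):
    # Stand-in for the serialization step inside encrypt_kwargs.
    if isinstance(v, tuple):
        return {"__type": "tuple", "__var": [serialize(x) for x in v]}
    if isinstance(v, dict):
        return {"__type": "dict", "__var": {k: serialize(x) for k, x in v.items()}}
    return v


def deserialize(v):
    # Stand-in for _decrypt_kwargs, which returns raw Python objects.
    if isinstance(v, dict) and v.get("__type") == "tuple":
        return tuple(deserialize(x) for x in v["__var"])
    if isinstance(v, dict) and v.get("__type") == "dict":
        return {k: deserialize(x) for k, x in v["__var"].items()}
    return v


def trigger_hash(classpath, kwargs):
    # Stand-in for BaseEventTrigger.hash: serialize the kwargs, then digest.
    payload = json.dumps({"classpath": classpath, "kwargs": serialize(kwargs)},
                         sort_keys=True)
    return hashlib.md5(payload.encode()).hexdigest()


def test_hash_consistent_after_db_round_trip():
    original = {"dims": (1, 2)}
    round_tripped = deserialize(serialize(original))  # DB store + load
    assert trigger_hash("pkg.MyTrigger", original) == \
        trigger_hash("pkg.MyTrigger", round_tripped)
```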
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]