Jorricks commented on a change in pull request #16554:
URL: https://github.com/apache/airflow/pull/16554#discussion_r656018821
##########
File path: airflow/serialization/serialized_objects.py
##########
@@ -186,13 +186,18 @@ def serialize_to_json(
if cls._is_excluded(value, key, object_to_serialize):
continue
- if key in decorated_fields:
- serialized_object[key] = cls._serialize(value)
- else:
- value = cls._serialize(value)
- if isinstance(value, dict) and "__type" in value:
+ value = cls._serialize(value)
+ if key not in decorated_fields and isinstance(value, dict) and
"__type" in value:
+ # Extra check to make sure for custom operators or dags that
the non-standard
+ # serialized_fields still support non-primitive types (e.g.
dicts).
+ if isinstance(object_to_serialize, DAG) and key in
DAG.get_serialized_fields():
+ value = value["__var"]
+ elif (
+ isinstance(object_to_serialize, BaseOperator)
+ and key in BaseOperator.get_serialized_fields()
+ ):
Review comment:
It's a bit of a pickle. I agree. Let me explain it with an example.
I copied this example from the test case.
```python
class MyOperator(BaseOperator):
__serialized_fields: Optional[frozenset] = None
def __init__(self, a_dictionary_value: Dict, **kwargs):
super().__init__(**kwargs)
self.a_dictionary_value = a_dictionary_value
@classmethod
def get_serialized_fields(cls):
if not cls.__serialized_fields:
custom_fields = {"a_dictionary_value"}
cls.__serialized_fields =
frozenset(super().get_serialized_fields() | custom_fields)
return cls.__serialized_fields
```
In the case when we serialize this operator, and the extra field
(`a_dictionary_value`) is a complex type, it will fail during deserialization
because it is missing the `__var` and `__type` keys that are not there anymore.
Now the proposal here, is that anything that is out-side the standard
functionality of the BaseOperator or DAG, is serialized in the full way with a
`__var` and `__type` dictionary.
Thereby, if we would call `object_to_serialize.get_serialized_fileds()` on
this object, `a_dictionary_value` would be seen as a 'standard' value (just a
dict without `__var` and `__type`) which would give an exception once we
deserialize.
Getting this to work was quite a hassle as there is so much custom logic. It
makes me wonder why this clause was added and why we are not always surrounding
complex types in a dict with `__var` and `__type` keys.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
For queries about this service, please contact Infrastructure at:
[email protected]