Desdroid commented on issue #57726: URL: https://github.com/apache/airflow/issues/57726#issuecomment-3523026651
> Documentation says (implemented as in public doc): serializers -> serialize() method -> `@dataclass` or `@attr.define`
>
> > Airflow out of the box supports three ways of custom serialization. Primitives are returned as is, without additional encoding, e.g. a str remains a str. When it is not a primitive (or iterable thereof) Airflow looks for a registered serializer and deserializer in the namespace of airflow.serialization.serializers. If not found it will look in the class for a serialize() method or in case of deserialization a deserialize(data, version: int) method. Finally, if the class is either decorated with @dataclass or @attr.define it will use the public methods for those decorators.
>
> Code docs: serialize() method -> serializers -> `@dataclass` or `@attr.define`
>
> > Values that are not of a built-in type are serialized if a serializer is
> > found for them. The order in which serializers are used is
> > 1. A `serialize` function provided by the object.
> > 2. A registered serializer in the namespace of `airflow.serialization.serializers`
> > 3. Annotations from attr or dataclass.
>
> As for me, it would be nice to have a way to override any serialization logic by custom user serializers.

That's probably what I would expect as well. I assume there were reasons for the current order. In fact, the code docs are currently wrong: the behavior is implemented as stated in the public docs. I tried to change that in #56881, but it seemed to break a provider, and I have not yet had time to figure out what the issue is. I also changed the serialization to use `mode="json"` in #56878, because for pydantic models this ensures the returned dict no longer contains fields that are not JSON-serializable (such as datetimes).

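
To make the lookup order from the public docs concrete, here is a minimal, standard-library-only sketch. It is not Airflow's actual implementation: `REGISTERED`, `serialize_value`, `Point`, and `Plain` are hypothetical names, and the dict merely stands in for the serializers registered under `airflow.serialization.serializers`.

```python
import dataclasses

# Hypothetical registry, standing in for airflow.serialization.serializers.
REGISTERED = {}
PRIMITIVES = (str, int, float, bool, type(None))


def serialize_value(obj):
    """Resolve a serializer in the order described by the public docs."""
    # Primitives are returned as is.
    if isinstance(obj, PRIMITIVES):
        return obj
    # Iterables of values are handled element by element.
    if isinstance(obj, (list, tuple)):
        return [serialize_value(item) for item in obj]
    # 1. A registered serializer wins over everything else.
    serializer = REGISTERED.get(type(obj))
    if serializer is not None:
        return serializer(obj)
    # 2. Otherwise fall back to the object's own serialize() method.
    if hasattr(obj, "serialize"):
        return obj.serialize()
    # 3. Finally, use the dataclass machinery.
    if dataclasses.is_dataclass(obj) and not isinstance(obj, type):
        return dataclasses.asdict(obj)
    raise TypeError(f"cannot serialize {type(obj).__name__}")


@dataclasses.dataclass
class Point:
    x: int
    y: int

    def serialize(self):
        return {"via": "method", "x": self.x, "y": self.y}


@dataclasses.dataclass
class Plain:
    value: int


before = serialize_value(Point(1, 2))   # uses Point.serialize()
REGISTERED[Point] = lambda p: {"via": "registry", "x": p.x, "y": p.y}
after = serialize_value(Point(1, 2))    # registered serializer takes over
plain = serialize_value(Plain(3))       # falls through to the dataclass step
```

With this order, registering a serializer for a type is enough to override a `serialize()` method defined on the class, which is the override behavior asked for in the quoted comment.
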
**In general I think this is rather a usage error.** You are using a pydantic model with an arbitrary type and therefore need to set `model_config = {"arbitrary_types_allowed": True}`. This means pydantic cannot create a proper schema for that field.

If you use pydantic models and want them to work properly with JSON serialization, you should only use types that pydantic supports. Classes used as pydantic fields need to be either types that pydantic supports or pydantic models / dataclasses themselves, so that pydantic can create a schema.

If Airflow wants to support its custom classes in pydantic models, we could think about either inheriting from `BaseModel` or defining `__get_pydantic_core_schema__` on the classes.
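
As a rough illustration of the `__get_pydantic_core_schema__` option, the sketch below assumes pydantic v2. `Endpoint` and `Payload` are hypothetical stand-ins, not Airflow classes, and representing the object as a plain string is an assumption made only for this example.

```python
from typing import Any

from pydantic import BaseModel, GetCoreSchemaHandler
from pydantic_core import core_schema


class Endpoint:
    """Hypothetical custom class that is not a pydantic model."""

    def __init__(self, uri: str) -> None:
        self.uri = uri

    @classmethod
    def __get_pydantic_core_schema__(
        cls, source_type: Any, handler: GetCoreSchemaHandler
    ) -> core_schema.CoreSchema:
        # Validate a plain string, then wrap it in an Endpoint.
        from_str = core_schema.no_info_after_validator_function(
            cls, core_schema.str_schema()
        )
        return core_schema.json_or_python_schema(
            json_schema=from_str,
            # In Python mode, accept existing instances as well as strings.
            python_schema=core_schema.union_schema(
                [core_schema.is_instance_schema(cls), from_str]
            ),
            # Serialize back to the plain string.
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda value: value.uri
            ),
        )


class Payload(BaseModel):
    # No arbitrary_types_allowed needed: Endpoint describes its own schema.
    endpoint: Endpoint


payload = Payload(endpoint=Endpoint("s3://bucket/key"))
dumped = payload.model_dump(mode="json")
restored = Payload.model_validate(dumped)
```

Because the class supplies its own schema, `model_dump(mode="json")` yields only JSON-compatible values and the dumped dict can be validated back into a model.
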
