Desdroid commented on issue #57726:
URL: https://github.com/apache/airflow/issues/57726#issuecomment-3523026651

   > Documentation says (and the implementation matches the public doc): serializers -> 
serialize() method -> `@dataclass` or `@attr.define`
   > 
   > > Airflow out of the box supports three ways of custom serialization. 
Primitives are returned as is, without additional encoding, e.g. a str remains 
a str. When it is not a primitive (or iterable thereof) Airflow looks for a 
registered serializer and deserializer in the namespace of 
airflow.serialization.serializers. If not found it will look in the class for a 
serialize() method or in case of deserialization a deserialize(data, version: 
int) method. Finally, if the class is either decorated with 
`@dataclass` or `@attr.define` it will use the 
public methods for those decorators.
   > 
   > Code docs: serialize() method -> serializers -> `@dataclass` or `@attr.define`
   > 
   > > Values that are not of a built-in type are serialized if a serializer is
   > > found for them. The order in which serializers are used is
   > > 1. A `serialize` function provided by the object.
   > > 2. A registered serializer in the namespace of 
`airflow.serialization.serializers`
   > > 3. Annotations from attr or dataclass.
   > 
   > As for me, it would be nice to have a way to override any serialization 
logic with custom user serializers. That's probably what I would expect, though 
I assume there were reasons for the current behavior.
   
   In fact the code doc is currently wrong; the actual implementation follows 
the order stated in the public docs. 
   I tried to change that in #56881, but it seemed to break some providers. I 
have not yet had time to figure out what the issue is. 
   
   I also changed the serialization to use `mode="json"` in #56878, since for 
pydantic models this ensures the returned dict no longer contains any 
non-JSON-serializable fields (such as datetimes).
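   For illustration, here is the effect of `mode="json"` on a pydantic v2 model (a minimal sketch; the `Run` model is made up):

```python
from datetime import datetime, timezone

from pydantic import BaseModel


class Run(BaseModel):
    name: str
    started: datetime


run = Run(name="job", started=datetime(2024, 1, 1, tzinfo=timezone.utc))

# mode="python" (the default) keeps the datetime object; mode="json" renders
# it as an ISO 8601 string, so every value in the dict is JSON-serializable.
assert isinstance(run.model_dump()["started"], datetime)
assert isinstance(run.model_dump(mode="json")["started"], str)
assert run.model_dump(mode="json")["started"].startswith("2024-01-01T00:00:00")
```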
   
   **In general I think this is rather a usage error.** 
   You are trying to use a pydantic model with an arbitrary type, which forces 
you to set `model_config = {"arbitrary_types_allowed": True}`. This means 
pydantic cannot create a proper pydantic schema for that field.
   If you use pydantic models and want them to work properly with JSON 
serialization, you should only use types that pydantic supports. 
   Classes used as pydantic fields need to be either types that pydantic 
supports or pydantic models / dataclasses themselves, so that pydantic can 
create a schema.
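   A minimal sketch of the problem, assuming pydantic v2 (the `Widget` and `Holder` classes are made up): with `arbitrary_types_allowed=True` the model builds and validation degrades to an `isinstance` check, but pydantic has no serialization schema for the field, so JSON dumping fails.

```python
from pydantic import BaseModel


class Widget:  # plain class with no pydantic schema
    def __init__(self, size: int):
        self.size = size


class Holder(BaseModel):
    # Without arbitrary_types_allowed=True, pydantic refuses to build the
    # model class at all, because it cannot derive a schema for Widget.
    model_config = {"arbitrary_types_allowed": True}
    widget: Widget


h = Holder(widget=Widget(3))

# Validation worked (isinstance check only), but pydantic has no idea how to
# serialize Widget, so JSON dumping raises.
try:
    h.model_dump_json()
    print("serialized")
except Exception as exc:
    print("failed:", type(exc).__name__)
```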
   
   If Airflow wants its custom classes to be usable in pydantic models, we 
could think about either inheriting from BaseModel or defining 
`__get_pydantic_core_schema__` on those classes. 
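   A sketch of the second option, assuming pydantic v2 (`Interval` is a made-up stand-in for an Airflow class, not a real Airflow type): defining `__get_pydantic_core_schema__` gives pydantic a real schema, so `arbitrary_types_allowed` is no longer needed and `mode="json"` dumps cleanly.

```python
from pydantic import BaseModel, GetCoreSchemaHandler
from pydantic_core import core_schema


class Interval:
    """Hypothetical custom class standing in for an Airflow type."""

    def __init__(self, seconds: int):
        self.seconds = seconds

    @classmethod
    def __get_pydantic_core_schema__(cls, source_type, handler: GetCoreSchemaHandler):
        # Accept either an Interval instance or a plain int on input, and
        # serialize back to an int, so pydantic can build a real schema.
        return core_schema.no_info_plain_validator_function(
            lambda v: v if isinstance(v, cls) else cls(int(v)),
            serialization=core_schema.plain_serializer_function_ser_schema(
                lambda v: v.seconds,
                return_schema=core_schema.int_schema(),
            ),
        )


class Task(BaseModel):
    timeout: Interval  # no arbitrary_types_allowed required


t = Task(timeout=Interval(30))
assert t.model_dump(mode="json") == {"timeout": 30}
assert Task(timeout=30).timeout.seconds == 30  # int input is coerced
```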
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
