potiuk commented on issue #40974:
URL: https://github.com/apache/airflow/issues/40974#issuecomment-2250048420

   The problem as I understand it is that we currently have two serializations:
   
   1) 
https://github.com/apache/airflow/blob/main/airflow/serialization/serialized_objects.py
 -> this one uses the "old" serialization, built around a giant if/else clause 
that we have to extend manually for every new type of object to serialize
   
   2) 
https://github.com/apache/airflow/blob/main/airflow/serialization/serde.py -> 
which is pluggable and presumably a much faster way of serializing. It uses the 
serializer modules in 
https://github.com/apache/airflow/tree/main/airflow/serialization/serializers 
and I believe you can also add your own serializers and implement 
"serialize/deserialize" methods in your objects. It has no giant if, and 
providers could potentially register their own serializers, etc. 
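   
   To make the difference concrete, here is a minimal toy sketch of the two 
styles. It is not Airflow's actual code: `serialize_if_else`, `register`, 
`_REGISTRY` and `Point` are hypothetical names made up for illustration, and 
the real modules linked above may work quite differently.

```python
from datetime import datetime
from typing import Any


# Style 1) toy version: every supported type needs a new branch here.
def serialize_if_else(obj: Any) -> dict:
    if isinstance(obj, datetime):
        return {"type": "datetime", "data": obj.isoformat()}
    elif isinstance(obj, set):
        return {"type": "set", "data": sorted(obj)}
    else:
        raise TypeError(f"cannot serialize {type(obj).__qualname__}")


# Style 2) toy version: a registry maps a type name to a pair of functions,
# so new types are added by registering them, not by editing serialize().
_REGISTRY: dict = {}


def register(cls: type, to_data, from_data) -> None:
    _REGISTRY[cls.__qualname__] = (to_data, from_data)


def serialize(obj: Any) -> dict:
    name = type(obj).__qualname__
    if name not in _REGISTRY:
        raise TypeError(f"no serializer registered for {name}")
    to_data, _ = _REGISTRY[name]
    return {"type": name, "data": to_data(obj)}


def deserialize(payload: dict) -> Any:
    _, from_data = _REGISTRY[payload["type"]]
    return from_data(payload["data"])


# "Built-in" serializers, which could each live in their own module.
register(datetime, lambda d: d.isoformat(), datetime.fromisoformat)
register(set, lambda s: sorted(s), set)


# A hypothetical third-party object that brings its own serialize/deserialize.
class Point:
    def __init__(self, x: int, y: int) -> None:
        self.x = x
        self.y = y

    def serialize(self) -> dict:
        return {"x": self.x, "y": self.y}

    @staticmethod
    def deserialize(data: dict) -> "Point":
        return Point(data["x"], data["y"])


register(Point, Point.serialize, Point.deserialize)
```

   In the second style, supporting `Point` needed no change to `serialize()` 
or `deserialize()` themselves, which is the property that would let providers 
plug in their own types.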
   
   Some of our code uses 1) and some uses 2). There are a few problems there: 
for example, 2) does not yet support DAG serialization, and it is also not used 
in, for example, the internal-api (but internal-api will be gone in Airflow 3).
   
   There are also other places where we use other serialization mechanisms 
(dill, cloudpickle), and there is potential to consolidate those as 
well. 
   
   Ideally (this was the long-term vision of @bolkedebruin) we should have "one 
to rule them all": 2) should become the universal serializer used by everything.
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

