gyli commented on issue #40974: URL: https://github.com/apache/airflow/issues/40974#issuecomment-2261882872
@bolkedebruin I'm still thinking about the best approach here. The tricky part is how to handle the two different encodings if the old code is still used. Even if we decide not to use the old code, we will still need to implement classes similar to `SerializedBaseOperator`, `SerializedDAG`, `DependencyDetector`, etc. in serde, mainly with their `serialize` and `deserialize` methods replaced by serde methods.

Approach 1: call `serialized_objects` from `serde` and keep the old output format for DAGs and operators. This approach requires minimal changes, and the existing jsonschema can be reused directly. The cost is that both encoding formats will be in use.

Approach 2: port the `SerializedBaseOperator`, `SerializedDAG`, and `DependencyDetector` classes into serde, and replace their `serialize` and `deserialize` methods with serde's. A new jsonschema file is needed because the encoding will change. Another reason to port those classes, rather than create a new serde serializer module, is that DAG serialization is recursive. This approach will make `serde.py` much longer, probably comparable in length to the current `serialized_objects.py` eventually.

Having given it a second thought, I think the second approach makes more sense. Although it requires more work, we might want a complete migration from the old serializer to the new one, considering this is an Airflow 2 to 3 change. Please let me know what you think.
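
For illustration only, the difference between the two approaches can be sketched with a toy example. Everything here is hypothetical: `ToyDag`, `legacy_encode`, `encode_approach_1`, and `encode_approach_2` are made-up names, not Airflow APIs, and the envelope keys (`__type`/`__var` for the old format, `__classname__`/`__version__`/`__data__` for serde) are assumptions used only to make the two output shapes visibly different.

```python
from dataclasses import asdict, dataclass
from typing import Any


@dataclass
class ToyDag:
    """Toy stand-in for a DAG; NOT Airflow's real DAG class."""

    dag_id: str
    tasks: list[str]


def legacy_encode(dag: ToyDag) -> dict[str, Any]:
    """Hypothetical stand-in for the old serializer's output format."""
    # Assumed legacy envelope: a type tag plus the payload.
    return {"__type": "dag", "__var": asdict(dag)}


def encode_approach_1(dag: ToyDag) -> dict[str, Any]:
    """Approach 1 sketch: the new entry point delegates to the old encoder.

    The output keeps the legacy shape, so an existing schema would still
    apply, but two envelope formats now coexist in the code base.
    """
    return legacy_encode(dag)


def encode_approach_2(dag: ToyDag) -> dict[str, Any]:
    """Approach 2 sketch: the ported class emits the new envelope directly.

    The output shape changes, which is why a new schema would be needed.
    """
    # Assumed serde-style envelope: class name, version, and payload.
    return {
        "__classname__": f"{ToyDag.__module__}.{ToyDag.__qualname__}",
        "__version__": 1,
        "__data__": asdict(dag),
    }


if __name__ == "__main__":
    dag = ToyDag(dag_id="example", tasks=["extract", "load"])
    print(encode_approach_1(dag))
    print(encode_approach_2(dag))
```

In this sketch both functions carry the same payload; only the envelope differs, which is the part a jsonschema would have to describe.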
