sjyangkevin commented on PR #51059: URL: https://github.com/apache/airflow/pull/51059#issuecomment-2977023775
Hi @potiuk , very appreciate the insights and I would like to share some thoughts. Feel free to correct me if I am wrong on anything below. We had a discussion in #50867 , and the issue with serializing Pydantic model raised in Cohere provider. Considering pydantic class may potentially be used by other providers, we think that it can be good to have it implemented in the core module such that it can be generic and reusable. In the current serialization module, I feel pandas, numpy, datetime, are similar to this case, which are common objects maybe used by multiple providers, or by tasks to pass in XComs. This approach may help avoid implementing similar things in different providers. serialization come from providers can also provide multiple benefits. 1.) we do not need a core release when updates are needed for serialization/deserialization for data created from a specific providers (iceberg should from iceberg provider, etc.) 2.) core can be minimal to just discover and register serde as extensions I am also very interested in looking into the option of how we can move it out of core and let provider managers to reuse common objects and register those as needed. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
