adriangb commented on PR #20037: URL: https://github.com/apache/datafusion/pull/20037#issuecomment-3807523660
> Will have a look, but still believe if it is needed distributed engine should implement it not in DataFusion. > > Could that be an option? It could be, but as per the PR title I think there is actually a real benefit for DataFusion as well. There are cases of serializing and deserializing plans that do not involve distributed engines, I don't personally use the Python APIs / FFI but I think those users also use protobuf serde. Basically at the moment dynamic filters (a built in feature of DataFusion) and protobuf serde (a built in feature of DataFusion) are incompatible. Further there is a strong argument to do this for all expressions e.g. to avoid duplicating the data inside of InList expressions, large literal strings, etc. *and* more faithfully roundtrip the expression graph/tree. I also don't see any harm to other users of DataFusion. This change is fully contained within the protobuf serde code, so it definitely has no impact on users that don't use that part of DataFusion. For users that do use that part the only breaking change is the extra field in the proto, and most users are not creating proto structs by hand (they are created by our functions). So I don't think there are really any breaking changes. There is also no performance impact (getting Arc pointers is a very cheap operation and I suspect - but have not run benchmarks to prove it - the cost of setting a O(num_expressions) u64 keys in a HashMap is going to be negligible compared to other serde overhead). This is my reasoning for implementing this in DataFusion. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
