adriangb commented on PR #20037:
URL: https://github.com/apache/datafusion/pull/20037#issuecomment-3807523660

   > Will have a look, but still believe if it is needed distributed engine 
should implement it not in DataFusion.
   > 
   > Could that be an option?
   
   It could be, but as per the PR title I think there is actually a real 
benefit for DataFusion as well.
   
   There are cases of serializing and deserializing plans that do not involve 
distributed engines, I don't personally use the Python APIs / FFI but I think 
those users also use protobuf serde.
   
   Basically at the moment dynamic filters (a built in feature of DataFusion) 
and protobuf serde (a built in feature of DataFusion) are incompatible. Further 
there is a strong argument to do this for all expressions e.g. to avoid 
duplicating the data inside of InList expressions, large literal strings, etc. 
*and* more faithfully roundtrip the expression graph/tree.
   
   I also don't see any harm to other users of DataFusion. This change is fully 
contained within the protobuf serde code, so it definitely has no impact on 
users that don't use that part of DataFusion. For users that do use that part 
the only breaking change is the extra field in the proto, and most users are 
not creating proto structs by hand (they are created by our functions). So I 
don't think there are really any breaking changes. There is also no performance 
impact (getting Arc pointers is a very cheap operation and I suspect - but have 
not run benchmarks to prove it - the cost of setting a O(num_expressions) u64 
keys in a HashMap is going to be negligible compared to other serde overhead).
   
   This is my reasoning for implementing this in DataFusion.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to