timsaucer commented on issue #1612:
URL: 
https://github.com/apache/datafusion-python/issues/1612#issuecomment-4843678368

   Thanks for the PR!
   
   The main issue I see with this is that it makes datafusion-distributed a new 
dependency for datafusion-python. That's going to add bloat to the already 
large wheels we're producing. Also, if we want to support ballista in 
the same way, then we're adding yet another external dependency and trying to 
ship/support them in this main repo.
   
   The big advantage of this PR is how small / easy it is.
   
   The longer term version I had in mind was that we expose via FFI the 
physical optimizer (done) and query planner (in progress). Then you have a 
relatively thin `datafusion-distributed` python package and when you create a 
session context you simply add in the new query planner or physical optimizer. 
This would work the same for ballista, datafusion-distributed, and any 
other package that comes along and wants to do something similar.
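
A rough sketch of the plug-in shape described above, in plain Python. Every name here (`SessionContextSketch`, `QueryPlanner`, `PhysicalOptimizerRule`, `DistributedPlanner`, `MarkStagesRule`) is hypothetical and is not an existing datafusion-python, datafusion-distributed, or ballista API. Plans are plain strings so that the example stays self-contained. It only illustrates the idea that the core package owns the extension points and a thin external package supplies the implementations.

```python
from dataclasses import dataclass, field
from typing import List


class QueryPlanner:
    """Hypothetical extension point: turns a logical plan into a physical plan."""

    def create_physical_plan(self, logical_plan: str) -> str:
        return f"LocalExec({logical_plan})"


class PhysicalOptimizerRule:
    """Hypothetical extension point: rewrites a physical plan."""

    def optimize(self, plan: str) -> str:
        raise NotImplementedError


@dataclass
class SessionContextSketch:
    """Stand-in for a session context that accepts pluggable components."""

    query_planner: QueryPlanner = field(default_factory=QueryPlanner)
    optimizer_rules: List[PhysicalOptimizerRule] = field(default_factory=list)

    def plan(self, logical_plan: str) -> str:
        # Plan with whichever planner was supplied, then apply rules in order.
        physical = self.query_planner.create_physical_plan(logical_plan)
        for rule in self.optimizer_rules:
            physical = rule.optimize(physical)
        return physical


# --- what a thin external package might provide (assumed, illustrative) ---

class DistributedPlanner(QueryPlanner):
    def create_physical_plan(self, logical_plan: str) -> str:
        return f"DistributedExec({logical_plan})"


class MarkStagesRule(PhysicalOptimizerRule):
    def optimize(self, plan: str) -> str:
        return f"Staged[{plan}]"


# The core package never imports the external one; the user wires them together.
default_ctx = SessionContextSketch()
distributed_ctx = SessionContextSketch(
    query_planner=DistributedPlanner(),
    optimizer_rules=[MarkStagesRule()],
)
```

With this arrangement, a second package such as one for ballista would ship its own planner or rule classes and plug into the same two slots, without adding a dependency to the core wheels.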
   
   What do you think?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

