alamb commented on issue #17719:
URL: https://github.com/apache/datafusion/issues/17719#issuecomment-3380847139

   > [@alamb](https://github.com/alamb) Do you have an intuition whether join 
ordering should be done as a LogicalPlan optimization, a PhysicalPlan optimization, 
or during physical planning?
   
   I think the intuition is that the best join order typically depends mostly 
on estimated cardinality (you want to plan the most selective joins first), 
which is a function of the predicates and the join order, rather than on physical 
characteristics of the plan (e.g. the join algorithms used).
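
   As a rough illustration of that cardinality-driven intuition, here is a 
minimal sketch of a greedy join-ordering heuristic. It is not DataFusion code: 
`JoinEdge`, `estimate_rows`, and `greedy_join_order` are hypothetical names, and 
the row counts and selectivities are assumed to come from some estimator. It 
starts from the smallest relation and repeatedly adds the relation that yields 
the smallest estimated intermediate result.

```rust
use std::collections::HashSet;

/// Hypothetical join-graph edge: a predicate between two relations (by index)
/// with an assumed selectivity, i.e. the fraction of the cross product kept.
struct JoinEdge {
    left: usize,
    right: usize,
    selectivity: f64,
}

/// Estimated row count after joining `next` into the already joined set.
/// With no connecting predicate this degrades to a cross-product estimate.
fn estimate_rows(
    current_rows: f64,
    joined: &HashSet<usize>,
    next: usize,
    rows: &[f64],
    edges: &[JoinEdge],
) -> f64 {
    let mut selectivity = 1.0;
    for e in edges {
        let connects = (joined.contains(&e.left) && e.right == next)
            || (joined.contains(&e.right) && e.left == next);
        if connects {
            selectivity *= e.selectivity;
        }
    }
    current_rows * rows[next] * selectivity
}

/// Greedy ordering: start from the smallest relation, then always add the
/// relation that gives the smallest estimated intermediate result.
fn greedy_join_order(rows: &[f64], edges: &[JoinEdge]) -> Vec<usize> {
    if rows.is_empty() {
        return Vec::new();
    }
    let start = (0..rows.len())
        .min_by(|&a, &b| rows[a].total_cmp(&rows[b]))
        .unwrap();
    let mut order = vec![start];
    let mut joined = HashSet::from([start]);
    let mut current_rows = rows[start];

    while order.len() < rows.len() {
        let mut best: Option<(usize, f64)> = None;
        for candidate in 0..rows.len() {
            if joined.contains(&candidate) {
                continue;
            }
            let est = estimate_rows(current_rows, &joined, candidate, rows, edges);
            if best.map_or(true, |(_, best_est)| est < best_est) {
                best = Some((candidate, est));
            }
        }
        let (next, est) = best.unwrap();
        order.push(next);
        joined.insert(next);
        current_rows = est;
    }
    order
}
```

   Note that nothing in this sketch looks at join algorithms or sort orders; 
it only uses estimated row counts and predicate selectivities, which is the 
sense in which join ordering is mostly independent of the physical plan.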
   
   That being said, I have definitely seen queries where a join order that is 
slightly worse by estimated cardinality is better overall for some other reason 
(e.g. it keeps the data sorted so you can use a MergeJoin rather than a 
HashJoin), so I think there is room for discussion here.
   
   I think creating a JoinGraph structure for DataFusion's `ExecutionPlan` 
rather than `LogicalPlan` is definitely worth considering, especially since, as 
you say, the current APIs have much more information and cardinality estimation 
is currently done at the physical level 🤔


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

