isidentical opened a new issue, #3929:
URL: https://github.com/apache/arrow-datafusion/issues/3929

   This is a meta issue for improving cost calculations and cost-based 
optimizations (CBOs) in DataFusion. Some statistics are already collected 
(mainly from the table sources), and some execution plan nodes already 
estimate statistics of their own. The overall idea is to improve these 
statistics and estimates, as well as the CBOs that rely on them.
   
   ### Main Goals
   - Have enough statistics to start nested join optimizations (#3843). This 
involves being able to estimate the weight of each join side and to reorder 
join sides globally, minimizing the overall cost of parent joins by 
reducing the output as much as possible at the bottom levels.
   - Provide a more reliable static analysis phase for physical execution 
operators (so that range-based pruning and predicate pruning can leverage the 
existing infrastructure in their implementations).
   - What else?
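
To make the first goal more concrete, here is a minimal sketch of ordering join inputs by estimated size. It is not DataFusion's API: `SideStats` and `order_by_estimated_rows` are hypothetical names, and row count is assumed to be the only "weight" available. Inputs with no statistics are pushed to the end, since nothing can be said about them.

```rust
/// Hypothetical, simplified statistics for one join input.
/// (Assumption: this is NOT a DataFusion type, just an illustration.)
#[derive(Debug, Clone, PartialEq)]
struct SideStats {
    name: &'static str,
    /// Estimated number of rows, or `None` when no statistics exist.
    num_rows: Option<usize>,
}

/// Orders join inputs so that the smallest known inputs come first.
/// Inputs without statistics sort last; the sort is stable, so ties
/// (including multiple unknown inputs) keep their original order.
fn order_by_estimated_rows(mut sides: Vec<SideStats>) -> Vec<SideStats> {
    sides.sort_by_key(|s| s.num_rows.unwrap_or(usize::MAX));
    sides
}
```

A real optimizer would also need to account for join predicates and selectivity; this only shows the "smallest inputs at the bottom" idea.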
   
   ### Work in Progress
   
   - [ ] https://github.com/apache/arrow-datafusion/issues/3898
   - [ ] https://github.com/apache/arrow-datafusion/issues/3845
   - What else?
   
   ### Planned
   - [ ] Estimating join cardinalities when the underlying table does not have 
any statistics 
(https://github.com/apache/arrow-datafusion/issues/3813#issuecomment-1276643214).
   - What else?
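
As a rough illustration of the planned item above, the sketch below uses a common textbook estimate for an inner equi-join (rows of left times rows of right, divided by the larger distinct count of the join key) and shows one possible way to degrade when statistics are missing. The function name, its parameters, and the fallback choices are all assumptions made for this example; they are not taken from the linked comment or from DataFusion.

```rust
/// Estimates the output row count of an inner equi-join.
/// (Assumption: a textbook formula, not DataFusion's implementation.)
///
/// - Unknown row counts on either side: the estimate stays unknown (`None`).
/// - Unknown distinct counts on both sides: fall back to the cartesian
///   product, which is an upper bound rather than an estimate.
fn estimate_inner_join_rows(
    left_rows: Option<usize>,
    right_rows: Option<usize>,
    left_distinct: Option<usize>,
    right_distinct: Option<usize>,
) -> Option<usize> {
    // Without row counts there is nothing to base an estimate on.
    let (l, r) = (left_rows?, right_rows?);
    // Saturate instead of overflowing on very large inputs.
    let product = l.saturating_mul(r);

    let max_distinct = match (left_distinct, right_distinct) {
        (Some(a), Some(b)) => a.max(b),
        (Some(a), None) | (None, Some(a)) => a,
        (None, None) => return Some(product),
    };
    if max_distinct == 0 {
        // No distinct key values means no rows can match.
        return Some(0);
    }
    Some(product / max_distinct)
}
```

Returning `None` instead of a guess keeps "unknown" distinguishable from "zero", so a later optimizer pass can decide whether to skip reordering or to apply a default.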

