liurenjie1024 commented on issue #2633:
URL: 
https://github.com/apache/arrow-datafusion/issues/2633#issuecomment-1169784027

   Hi @alamb @andygrove, I've finished a simple PoC. You can find the code 
here: 
https://github.com/liurenjie1024/rust-opt-framework/tree/main/src/datafusion_poc
   
   Here are the general ideas:
   
   1. To adopt the new heuristic optimizer, we can wrap `HeuristicOptimizer` as 
an optimizer rule, which works as follows:
   ```
   Datafusion Logical Plan -> Our Logical Plan -> HeuristicOptimizer -> Our 
Logical Plan -> Datafusion Logical Plan
   ```
   You can find an implementation here:
   
https://github.com/liurenjie1024/rust-opt-framework/blob/main/src/datafusion_poc/rule.rs
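
   As a rough illustration of the round trip above, here is a minimal, 
   self-contained sketch. Every name in it (`DfPlan`, `OurPlan`, 
   `DfOptimizerRule`, `HeuristicRuleAdapter`, `merge_filters`) is a hypothetical 
   stand-in, not DataFusion's actual API and not the code in the linked PoC. It 
   only shows the shape of the idea: convert in, run the rules to a fixpoint, 
   convert back out.

```rust
// Hypothetical stand-ins only: these are not DataFusion's or the PoC's real types.

/// Stand-in for a DataFusion logical plan.
#[derive(Debug, Clone, PartialEq)]
pub enum DfPlan {
    Scan(String),
    Filter { predicate: String, input: Box<DfPlan> },
}

/// Stand-in for the optimizer framework's own logical plan.
#[derive(Debug, Clone, PartialEq)]
pub enum OurPlan {
    Scan(String),
    Filter { predicate: String, input: Box<OurPlan> },
}

/// DataFusion logical plan -> our logical plan.
pub fn to_ours(plan: &DfPlan) -> OurPlan {
    match plan {
        DfPlan::Scan(table) => OurPlan::Scan(table.clone()),
        DfPlan::Filter { predicate, input } => OurPlan::Filter {
            predicate: predicate.clone(),
            input: Box::new(to_ours(input)),
        },
    }
}

/// Our logical plan -> DataFusion logical plan.
pub fn to_df(plan: &OurPlan) -> DfPlan {
    match plan {
        OurPlan::Scan(table) => DfPlan::Scan(table.clone()),
        OurPlan::Filter { predicate, input } => DfPlan::Filter {
            predicate: predicate.clone(),
            input: Box::new(to_df(input)),
        },
    }
}

/// A rewrite rule returns `Some(new_plan)` when it fires and `None` otherwise.
pub type Rule = fn(&OurPlan) -> Option<OurPlan>;

/// Example rule: collapse two adjacent filters into one conjunctive filter.
pub fn merge_filters(plan: &OurPlan) -> Option<OurPlan> {
    if let OurPlan::Filter { predicate: outer, input } = plan {
        if let OurPlan::Filter { predicate: inner, input: grandchild } = &**input {
            return Some(OurPlan::Filter {
                predicate: format!("{} AND {}", inner, outer),
                input: grandchild.clone(),
            });
        }
    }
    None
}

pub struct HeuristicOptimizer {
    rules: Vec<Rule>,
}

impl HeuristicOptimizer {
    pub fn new(rules: Vec<Rule>) -> Self {
        Self { rules }
    }

    /// Optimize children first, then apply rules at this node until none fires.
    pub fn optimize(&self, plan: OurPlan) -> OurPlan {
        let mut current = match plan {
            OurPlan::Filter { predicate, input } => OurPlan::Filter {
                predicate,
                input: Box::new(self.optimize(*input)),
            },
            leaf => leaf,
        };
        loop {
            let mut changed = false;
            for rule in &self.rules {
                if let Some(next) = rule(&current) {
                    current = next;
                    changed = true;
                }
            }
            if !changed {
                return current;
            }
        }
    }
}

/// Stand-in for the host engine's optimizer-rule interface.
pub trait DfOptimizerRule {
    fn optimize(&self, plan: &DfPlan) -> DfPlan;
}

/// Wraps the heuristic optimizer so the host engine sees a single rule.
pub struct HeuristicRuleAdapter {
    inner: HeuristicOptimizer,
}

impl HeuristicRuleAdapter {
    pub fn with_default_rules() -> Self {
        Self { inner: HeuristicOptimizer::new(vec![merge_filters as Rule]) }
    }
}

impl DfOptimizerRule for HeuristicRuleAdapter {
    fn optimize(&self, plan: &DfPlan) -> DfPlan {
        to_df(&self.inner.optimize(to_ours(plan)))
    }
}
```

   With this shape, the host engine never sees `OurPlan`: the adapter is 
   registered like any other rule, and the conversions stay private to it.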
   
   2. To adopt the new Cascades-style cost-based optimizer, we can implement a new 
`QueryPlanner`, which works as follows:
   ```
   Datafusion logical plan -> Our logical plan -> Cost based optimizer -> Our 
physical plan -> Datafusion physical plan
   ```
   You can find an implementation here:
   
https://github.com/liurenjie1024/rust-opt-framework/blob/main/src/datafusion_poc/planner.rs
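
   The planner pipeline above could be sketched roughly as follows. This is a 
   toy under stated assumptions: all four plan enums and the `import`, `search`, 
   `export`, and `plan` functions are hypothetical names, not DataFusion's 
   `QueryPlanner` API or the PoC's code. The search step here is plain 
   exhaustive enumeration of join alternatives with hard-coded costs, not a real 
   Cascades search with a memo.

```rust
// Hypothetical stand-ins only: none of these types come from DataFusion or the PoC.

/// Stand-in for a DataFusion logical plan.
#[derive(Debug, Clone, PartialEq)]
pub enum DfLogical {
    Scan(String),
    Join(Box<DfLogical>, Box<DfLogical>),
}

/// Stand-in for a DataFusion physical plan.
#[derive(Debug, Clone, PartialEq)]
pub enum DfPhysical {
    Scan(String),
    HashJoin(Box<DfPhysical>, Box<DfPhysical>),
    NestedLoopJoin(Box<DfPhysical>, Box<DfPhysical>),
}

/// The optimizer framework's own logical plan.
#[derive(Debug, Clone)]
enum OurLogical {
    Scan(String),
    Join(Box<OurLogical>, Box<OurLogical>),
}

/// The optimizer framework's own physical plan.
#[derive(Debug, Clone)]
enum OurPhysical {
    Scan(String),
    HashJoin(Box<OurPhysical>, Box<OurPhysical>),
    NestedLoopJoin(Box<OurPhysical>, Box<OurPhysical>),
}

/// Step 1: DataFusion logical plan -> our logical plan.
fn import(plan: &DfLogical) -> OurLogical {
    match plan {
        DfLogical::Scan(t) => OurLogical::Scan(t.clone()),
        DfLogical::Join(l, r) => OurLogical::Join(Box::new(import(l)), Box::new(import(r))),
    }
}

/// Arbitrary illustrative costs; a real model would be pluggable.
fn cost(plan: &OurPhysical) -> u64 {
    match plan {
        OurPhysical::Scan(_) => 1,
        OurPhysical::HashJoin(l, r) => 2 + cost(l) + cost(r),
        OurPhysical::NestedLoopJoin(l, r) => 10 + cost(l) + cost(r),
    }
}

/// Step 2: pick the cheapest physical alternative for each logical operator.
fn search(plan: &OurLogical) -> OurPhysical {
    match plan {
        OurLogical::Scan(t) => OurPhysical::Scan(t.clone()),
        OurLogical::Join(l, r) => {
            let (l, r) = (search(l), search(r));
            let alternatives = vec![
                OurPhysical::HashJoin(Box::new(l.clone()), Box::new(r.clone())),
                OurPhysical::NestedLoopJoin(Box::new(l), Box::new(r)),
            ];
            alternatives
                .into_iter()
                .min_by_key(cost)
                .expect("at least one alternative is always generated")
        }
    }
}

/// Step 3: our physical plan -> DataFusion physical plan.
fn export(plan: &OurPhysical) -> DfPhysical {
    match plan {
        OurPhysical::Scan(t) => DfPhysical::Scan(t.clone()),
        OurPhysical::HashJoin(l, r) => {
            DfPhysical::HashJoin(Box::new(export(l)), Box::new(export(r)))
        }
        OurPhysical::NestedLoopJoin(l, r) => {
            DfPhysical::NestedLoopJoin(Box::new(export(l)), Box::new(export(r)))
        }
    }
}

/// The whole pipeline, as the planner entry point would run it.
pub fn plan(logical: &DfLogical) -> DfPhysical {
    export(&search(&import(logical)))
}
```

   Only `plan` and the two `Df*` enums are public here, which mirrors the idea 
   that the framework's internal plan types never leak into the host engine.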
   
   3. For robust CBO behavior without statistics, I prefer to use a trivial 
cost model, for example one that adds a penalty for operators like sort, nested 
loop join, etc. I don't have an implementation of this yet, but I think the 
optimizer framework is flexible enough that we can add it later.
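
   One possible shape for such a statistics-free cost model is sketched below. 
   It is only an assumption about how the idea might look: `Op`, `Plan`, 
   `PenaltyCostModel`, and the penalty values are all made up for illustration 
   and do not exist in the PoC. Each operator costs a flat base amount, and the 
   operators to discourage cost extra.

```rust
// Hypothetical sketch: names and penalty values are illustrative assumptions.

#[derive(Debug, Clone, PartialEq)]
pub enum Op {
    Scan,
    Filter,
    HashJoin,
    NestedLoopJoin,
    Sort,
}

#[derive(Debug, Clone)]
pub struct Plan {
    pub op: Op,
    pub children: Vec<Plan>,
}

impl Plan {
    pub fn new(op: Op, children: Vec<Plan>) -> Self {
        Self { op, children }
    }
}

/// Cost model that needs no statistics: a flat cost per operator plus
/// fixed penalties for operators we would rather avoid.
pub struct PenaltyCostModel {
    pub base: u64,
    pub sort_penalty: u64,
    pub nested_loop_penalty: u64,
}

impl PenaltyCostModel {
    fn op_cost(&self, op: &Op) -> u64 {
        match op {
            Op::Sort => self.base + self.sort_penalty,
            Op::NestedLoopJoin => self.base + self.nested_loop_penalty,
            Op::Scan | Op::Filter | Op::HashJoin => self.base,
        }
    }

    /// Total cost of a plan: this operator's cost plus the cost of all inputs.
    pub fn cost(&self, plan: &Plan) -> u64 {
        self.op_cost(&plan.op) + plan.children.iter().map(|c| self.cost(c)).sum::<u64>()
    }
}
```

   Because the penalties are constants, the ranking of two alternatives depends 
   only on which operators they contain, so the optimizer's choice stays stable 
   when table statistics are missing or wrong.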
   


