Re: [PR] Implement semi/anti join output statistics estimation [arrow-datafusion]

via GitHub Tue, 02 Apr 2024 11:44:28 -0700


alamb commented on PR #9800:
URL: 
https://github.com/apache/arrow-datafusion/pull/9800#issuecomment-2032797206


   🤔  In my mind, the way a cost-based optimizer (CBO) typically works is that 
there are:
   1. A set of heuristics that take things like partitioning/ordering into 
account to create potential plans (with potentially different join orders)
   2. A cost model that is then used to pick between the potential plans
   
   I was thinking that if we could decouple the "make some potential plans" and 
"what would it cost to run this query" parts, we could let people implement 
their own cost-based optimizer (and we could pull the basic cardinality 
estimation code into the "built-in" cost model).
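
   To make the proposed split concrete, here is a minimal sketch of what decoupling plan enumeration from costing could look like. Everything in it is hypothetical: `Plan`, `PlanEnumerator`, `CostModel`, `SwapJoinSides`, `RowCountCost`, and `choose` are illustrative names, not DataFusion types or APIs, and the cost formula is a toy that only assumes building a hash table costs more per row than probing it.

```rust
/// Hypothetical stand-in for a physical plan; NOT a DataFusion type.
#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan { table: String, rows: f64 },
    HashJoin { build: Box<Plan>, probe: Box<Plan> },
}

/// Part 1: "make some potential plans". Knows nothing about cost.
trait PlanEnumerator {
    fn candidates(&self, left: &Plan, right: &Plan) -> Vec<Plan>;
}

/// Part 2: "what would it cost to run this query". Knows nothing about
/// how the candidates were produced, so users could swap in their own.
trait CostModel {
    fn cost(&self, plan: &Plan) -> f64;
}

/// Toy enumerator: offer both join orders.
struct SwapJoinSides;

impl PlanEnumerator for SwapJoinSides {
    fn candidates(&self, left: &Plan, right: &Plan) -> Vec<Plan> {
        vec![
            Plan::HashJoin { build: Box::new(left.clone()), probe: Box::new(right.clone()) },
            Plan::HashJoin { build: Box::new(right.clone()), probe: Box::new(left.clone()) },
        ]
    }
}

/// Toy "built-in" cost model driven only by row-count estimates.
struct RowCountCost;

impl RowCountCost {
    /// Assumed cardinality estimate: a join emits as many rows as its larger input.
    fn rows(plan: &Plan) -> f64 {
        match plan {
            Plan::Scan { rows, .. } => *rows,
            Plan::HashJoin { build, probe } => Self::rows(build).max(Self::rows(probe)),
        }
    }
}

impl CostModel for RowCountCost {
    fn cost(&self, plan: &Plan) -> f64 {
        match plan {
            Plan::Scan { rows, .. } => *rows,
            // Assumption: each build-side row costs 2 units, each probe-side row 1 unit.
            Plan::HashJoin { build, probe } => {
                self.cost(build) + self.cost(probe)
                    + 2.0 * Self::rows(build)
                    + Self::rows(probe)
            }
        }
    }
}

/// Glue: enumerate candidates, then let the cost model pick the cheapest.
fn choose(e: &dyn PlanEnumerator, m: &dyn CostModel, left: &Plan, right: &Plan) -> Option<Plan> {
    e.candidates(left, right)
        .into_iter()
        .min_by(|a, b| m.cost(a).total_cmp(&m.cost(b)))
}
```

   Under these assumptions, `choose` only talks to the two traits, so a custom cost model could be dropped in without touching the code that generates candidate plans.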
   
   I don't have time to pursue the idea much right now.


