korowa commented on PR #9800:
URL: 
https://github.com/apache/arrow-datafusion/pull/9800#issuecomment-2028848371

   > What do you think about trying to extract the cost models (e.g. 
cardinality estimation) into some API?
   
   @alamb, I don't have a strong opinion here (I probably lack knowledge of the 
use cases), but if I understood the idea correctly, there are two sides to it. 
On one hand, it could make DF more versatile: AFAIU the physical optimizer can 
already be customized, and such an API would allow reusing the optimizer with a 
custom cost model. On the other hand, if the end goal is an 
extensible/customizable API that provides the data required for physical plan 
optimization, I'm not sure a statistics estimation API will be enough, since 
other attributes also affect the physical plan significantly (e.g. 
partitioning- and ordering-related attributes). As a result, to provide all 
the inputs an arbitrary external planner needs, we may end up with more or less 
the same `ExecutionPlan`.
   
   Maybe it would be better to start with an internal estimator API (maybe not 
an "API", but just a set of functions like the ones we have now spread across 
various utility files, only better organized) and, for now, keep providing 
statistics through operators (as it works right now) using these utility 
functions?
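
   To illustrate the shape of that idea, here is a rough sketch. Everything in 
it is hypothetical: `Stats`, the `estimate` module, and `FilterOp` are 
simplified stand-ins invented for this example and are not DataFusion's actual 
types or functions. The estimation math lives in one module of plain functions, 
and the operator still exposes statistics itself but delegates to those helpers.

```rust
// Hypothetical, simplified stand-in; NOT DataFusion's real `Statistics` type.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Stats {
    /// Estimated row count, `None` when unknown.
    pub num_rows: Option<usize>,
}

/// Estimator helpers gathered in one place instead of being scattered
/// across utility files (module and function names are assumptions).
pub mod estimate {
    use super::Stats;

    /// Rows left after a filter with the given selectivity (clamped to 0..=1).
    pub fn filter(input: Stats, selectivity: f64) -> Stats {
        let s = selectivity.clamp(0.0, 1.0);
        Stats {
            num_rows: input.num_rows.map(|n| (n as f64 * s).ceil() as usize),
        }
    }

    /// Textbook inner equi-join estimate: |L| * |R| / max(ndv_l, ndv_r).
    /// Returns an unknown row count if either input is unknown.
    pub fn inner_join(left: Stats, right: Stats, ndv_left: usize, ndv_right: usize) -> Stats {
        // Guard against division by zero when both distinct counts are 0.
        let ndv = ndv_left.max(ndv_right).max(1);
        let num_rows = match (left.num_rows, right.num_rows) {
            (Some(l), Some(r)) => Some(l.saturating_mul(r) / ndv),
            _ => None,
        };
        Stats { num_rows }
    }
}

/// Hypothetical operator: it keeps providing statistics itself,
/// but the math comes from the shared helpers above.
pub struct FilterOp {
    pub input_stats: Stats,
    pub selectivity: f64,
}

impl FilterOp {
    pub fn statistics(&self) -> Stats {
        estimate::filter(self.input_stats, self.selectivity)
    }
}
```

   With this layout, operators remain the only public source of statistics, 
and the helpers could later be grouped behind a trait if a pluggable cost model 
turns out to be needed.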


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
