imback82 commented on issue #25015: [SPARK-28217][SQL] Allow a pluggable 
statistics plan visitor for a logical plan.
URL: https://github.com/apache/spark/pull/25015#issuecomment-507980634
 
 
   > What's the use case here? How does one use this without having fields to 
store stats?
   
   Today, cost/stats calculation in Catalyst is hard-coded and difficult to 
extend or customize (i.e., it only supports the "size in bytes" and "basic 
stats" plan visitors). Cost/stats estimation has been known to be a hard 
problem for decades, and people have tried numerous approaches in both the 
literature and practice. Indeed, some of our own customers have requested 
flexibility that allows them to plug in their own cost/stats calculation 
mechanisms. This PR provides an extension point where a user can plug in a 
custom statistics plan visitor that estimates stats/costs differently from the 
built-in ones without, of course, disrupting the existing use cases.
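
   As a rough illustration of the idea only (a toy model, not the code in this 
PR and not Spark's actual classes), the sketch below shows a planner that takes 
its stats visitor as a parameter and falls back to a built-in one. Every name 
here (`Plan`, `Stats`, `StatsVisitor`, `SelectivityStatsVisitor`, `estimate`) 
is hypothetical, and the 10% filter selectivity is an arbitrary assumption.

```scala
// Toy model: hypothetical names throughout, none of this is Spark's API.
object PluggableStatsSketch {
  // A minimal logical plan tree.
  sealed trait Plan
  final case class Scan(table: String, sizeInBytes: BigInt) extends Plan
  final case class Filter(child: Plan) extends Plan
  final case class Join(left: Plan, right: Plan) extends Plan

  final case class Stats(sizeInBytes: BigInt)

  // The extension point: anything that maps a plan to stats.
  trait StatsVisitor {
    def visit(plan: Plan): Stats
  }

  // Stand-in for a built-in visitor: a filter passes its child's size
  // through unchanged, and a join multiplies the sizes of its inputs.
  object DefaultStatsVisitor extends StatsVisitor {
    def visit(plan: Plan): Stats = plan match {
      case Scan(_, size) => Stats(size)
      case Filter(child) => visit(child)
      case Join(l, r)    => Stats(visit(l).sizeInBytes * visit(r).sizeInBytes)
    }
  }

  // A user-supplied visitor: same rules, except that a filter is assumed
  // to keep only `selectivityPercent` percent of its input.
  final class SelectivityStatsVisitor(selectivityPercent: Int) extends StatsVisitor {
    def visit(plan: Plan): Stats = plan match {
      case Scan(_, size) => Stats(size)
      case Filter(child) =>
        Stats(visit(child).sizeInBytes * BigInt(selectivityPercent) / BigInt(100))
      case Join(l, r)    => Stats(visit(l).sizeInBytes * visit(r).sizeInBytes)
    }
  }

  // The visitor is a parameter with a default, so callers that do not
  // pass one keep the existing behavior.
  def estimate(plan: Plan, visitor: StatsVisitor = DefaultStatsVisitor): Stats =
    visitor.visit(plan)

  val plan: Plan = Join(Filter(Scan("a", BigInt(1000))), Scan("b", BigInt(200)))
}
```

   With this toy plan, `estimate(plan)` uses the default visitor, while 
`estimate(plan, new SelectivityStatsVisitor(10))` yields a smaller estimate 
because the filter is no longer treated as a pass-through.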

