alamb commented on PR #13736:
URL: https://github.com/apache/datafusion/pull/13736#issuecomment-2544117875

   > Is having custom statistics something that DataFusion might support? For 
example, I could declare a custom statistic along with a custom optimizer rule 
that makes use of it. I can also see the opposite argument that if a statistic 
is in any way useful, then DataFusion should add support for it internally, and 
therefore it doesn't need extensible stats.
   
   Extending statistics to support user defined data seems very reasonable to 
me. A good test, in my mind, for avoiding APIs that can't actually be used in 
the real world is to try to make some sort of example showing how someone would 
actually use it (e.g. maybe pass the custom statistics into a user defined 
function that can take advantage of them somehow?)
   
   > Do you consider this to be blocking for this PR? Or is expanding the size 
of ColumnStatistics acceptable in the short-term?
   
   I don't consider it blocking per se -- especially if we are (finally) going 
to get the project to revamp Statistics moving again.
   
   I would like to get some consensus on what we want to do with Statistics / 
range / interval evaluation on statistics so that we don't end up with multiple 
incompatible, partially overlapping features.
   
   Thank you again


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

