alamb commented on PR #13736: URL: https://github.com/apache/datafusion/pull/13736#issuecomment-2544117875
> Is having custom statistics something that DataFusion might support? For example, I could declare a custom statistic along with a custom optimizer rule that makes use of it. I can also see the opposite argument that if a statistic is in any way useful, then DataFusion should add support for it internally, and therefore it doesn't need extensible stats.

Extending statistics to support user-defined data seems very reasonable to me. A good test, in my mind, for avoiding APIs that can't actually be used in the real world is to try to make some sort of example showing how someone would actually use it (e.g. maybe pass the custom statistics into a user-defined function that can take advantage of them somehow?).

> Do you consider this to be blocking for this PR? Or is expanding the size of ColumnStatistics acceptable in the short-term?

I don't consider it blocking per se -- especially if we are (finally) going to get the project to revamp Statistics moving again.

I would like to get some consensus on what we want to do with Statistics / range / interval evaluation on statistics so that we don't end up with multiple incompatible, partially overlapping features.

Thank you again
