alamb commented on PR #20188: URL: https://github.com/apache/datafusion/pull/20188#issuecomment-3861451365
> I do wonder if it would be okay to say the statistics are coupled to the scan plan -> if we know some row groups will not be read and we can use that information to make more accurate statistics we should / can.
>
> One 🎣 for another day: how do struct statistics fit into our stats framework?

One thought I had was to use some sort of delayed statistics thing -- like have a callback to produce statistics and only compute them on demand when they are actually used. Otherwise figuring out what stats will be used is going to be a very tricky business.

But maybe on demand would also be tricky
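
To make the "delayed statistics" idea concrete, here is a rough sketch of a callback that is run at most once, the first time the statistics are requested. The names `SimpleStats` and `LazyStats` are hypothetical and are not part of DataFusion's API; this only illustrates the shape of the idea, not a proposed implementation.

```rust
use std::sync::OnceLock;

/// Hypothetical, simplified stand-in for a statistics struct.
/// This is NOT DataFusion's `Statistics` type.
#[derive(Debug, Clone, PartialEq)]
pub struct SimpleStats {
    pub num_rows: Option<usize>,
}

/// Statistics produced by a callback and cached after the first request.
/// (Assumed design: the callback is cheap to store and expensive to run.)
pub struct LazyStats {
    compute: Box<dyn Fn() -> SimpleStats + Send + Sync>,
    cached: OnceLock<SimpleStats>,
}

impl LazyStats {
    /// Store the callback without running it.
    pub fn new(compute: impl Fn() -> SimpleStats + Send + Sync + 'static) -> Self {
        Self {
            compute: Box::new(compute),
            cached: OnceLock::new(),
        }
    }

    /// Run the callback on first use, then return the cached value.
    pub fn get(&self) -> &SimpleStats {
        self.cached.get_or_init(|| (self.compute)())
    }

    /// True once the callback has been run.
    pub fn is_computed(&self) -> bool {
        self.cached.get().is_some()
    }
}
```

With something like this, a plan node could hand out a `LazyStats` and an optimizer rule that never calls `get()` would never pay for the computation. It does not address the harder question of which statistics (for example per struct field) a caller would ask for.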
