Re: [PR] Better document the relationship between `FileFormat::projection` / `FileFormat::filter` and `FileScanConfig::Statistics` [datafusion]

via GitHub Fri, 06 Feb 2026 08:45:24 -0800


alamb commented on PR #20188:
URL: https://github.com/apache/datafusion/pull/20188#issuecomment-3861451365


   > I do wonder if it would be okay to say the statistics are coupled to the 
scan plan -> if we know some row groups will not be read and we can use that 
information to make more accurate statistics we should / can.
   > 
   > One 🎣 for another day: how do struct statistics fit into our stats 
framework?
   
   One thought I had was to use some sort of delayed (lazy) statistics: have a 
callback that produces the statistics, and only compute them on demand when 
they are actually used. Otherwise, figuring out ahead of time which stats will 
be used is going to be a very tricky business. But maybe on-demand computation 
would also be tricky.
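
To make the "callback, computed on demand" idea concrete, here is a minimal sketch of what such a wrapper could look like. Everything in it is hypothetical: `SketchStatistics` is a simplified stand-in (not DataFusion's actual `Statistics` type), and `LazyStatistics` is an illustrative name, not an existing or proposed API. It only shows the caching shape: the callback runs at most once, and only if someone asks for the stats.

```rust
use std::sync::OnceLock;

/// Hypothetical, simplified stand-in for a statistics struct.
/// This is NOT DataFusion's `Statistics`; it only carries two fields
/// so the example stays self-contained.
#[derive(Debug, Clone, PartialEq)]
pub struct SketchStatistics {
    pub num_rows: Option<usize>,
    pub total_byte_size: Option<usize>,
}

/// Hypothetical wrapper that defers computing statistics until first use.
pub struct LazyStatistics {
    /// Callback that produces the statistics (assumed to be potentially expensive).
    compute: Box<dyn Fn() -> SketchStatistics + Send + Sync>,
    /// Holds the result once the callback has run.
    cached: OnceLock<SketchStatistics>,
}

impl LazyStatistics {
    /// Store the callback without invoking it.
    pub fn new(compute: impl Fn() -> SketchStatistics + Send + Sync + 'static) -> Self {
        Self {
            compute: Box::new(compute),
            cached: OnceLock::new(),
        }
    }

    /// Return the statistics, running the callback on the first call only.
    pub fn get(&self) -> &SketchStatistics {
        self.cached.get_or_init(|| (self.compute)())
    }

    /// Whether the statistics have already been computed.
    pub fn is_computed(&self) -> bool {
        self.cached.get().is_some()
    }
}
```

A scan could hand out a `LazyStatistics` built from a closure over its file groups. Optimizer rules that never call `get()` would then pay nothing. One thing this sketch does not address is the "also tricky" part: the closure captures whatever scan state exists when it is created, so a projection or filter pushed down afterwards would need to rebuild or invalidate it.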


