adriangb commented on PR #15301:
URL: https://github.com/apache/datafusion/pull/15301#issuecomment-2753068116

   Noting a future optimization opportunity for after this work lands: push
pre-collected stats down into ParquetSource / DataSourceExec so that dynamic
filters can use them to prune files without opening them at all. This only
helps when stats were collected during the planning phase (e.g. by
ListingTableProvider or a secondary index) but did not prune the file because
no suitable filter existed at the time. If a dynamically generated filter
later _can_ prune based on those stats, we avoid reading or re-reading the
Parquet metadata.
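
To make the idea concrete, here is a minimal sketch of the pruning decision. It is only an illustration under stated assumptions: `ColumnStats`, `PlannedFile`, `DynamicGt`, and `files_to_open` are hypothetical names, not DataFusion APIs, and the dynamic filter is simplified to a single `col > threshold` predicate over one integer column.

```rust
/// Hypothetical min/max statistics for one column of one file,
/// collected during planning (not a DataFusion type).
#[derive(Debug, Clone, Copy)]
pub struct ColumnStats {
    pub min: i64,
    pub max: i64,
}

/// Hypothetical planned file: a path plus stats, if any were pre-collected.
#[derive(Debug, Clone)]
pub struct PlannedFile {
    pub path: String,
    pub stats: Option<ColumnStats>,
}

/// Hypothetical dynamic filter `col > threshold`, where the threshold
/// only becomes known at execution time.
#[derive(Debug, Clone, Copy)]
pub struct DynamicGt {
    pub threshold: i64,
}

impl DynamicGt {
    /// True when the stats prove that no row in the file can match,
    /// i.e. even the largest value is not greater than the threshold.
    pub fn can_prune(&self, stats: &ColumnStats) -> bool {
        stats.max <= self.threshold
    }
}

/// Returns the paths that still need to be opened. Files whose
/// pre-collected stats rule out a match are skipped without touching
/// their Parquet metadata; files without stats are kept conservatively.
pub fn files_to_open<'a>(files: &'a [PlannedFile], filter: &DynamicGt) -> Vec<&'a str> {
    files
        .iter()
        .filter(|f| match &f.stats {
            Some(stats) => !filter.can_prune(stats),
            None => true,
        })
        .map(|f| f.path.as_str())
        .collect()
}
```

The point of the sketch is that the decision uses only information already held in the plan, so a file that is pruned this way costs no I/O at all.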


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
