alamb commented on code in PR #15301:
URL: https://github.com/apache/datafusion/pull/15301#discussion_r2021602649
##########
datafusion/common/src/config.rs:
##########
@@ -590,6 +590,13 @@ config_namespace! {
/// during aggregations, if possible
pub enable_topk_aggregation: bool, default = true
+ /// When set to true attempts to push down dynamic filters generated by operators into the file scan phase.
+ /// For example, for a query such as `SELECT * FROM t ORDER BY timestamp DESC LIMIT 10`, the optimizer
+ /// will attempt to push down the current top 10 timestamps that the TopK operator references into the file scans.
+ /// This means that if we already have 10 timestamps in the year 2025
+ /// any files that only have timestamps in the year 2024 can be skipped / pruned at various stages in the scan.
Review Comment:
this is a great example
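
To make the pruning idea in the doc comment concrete, here is a minimal sketch. It is not the code from this PR: `FileStats`, `topk_threshold`, and `prune_files` are hypothetical names, timestamps are reduced to plain `i64` years, and per-file max statistics are assumed to be available before the file is read.

```rust
/// Hypothetical per-file statistics (an assumption for illustration);
/// a real scan would read these from file metadata.
#[derive(Debug, Clone, PartialEq)]
struct FileStats {
    name: &'static str,
    max_ts: i64,
}

/// Returns the smallest timestamp currently held in the top `k`, or `None`
/// if fewer than `k` rows have been seen, in which case no filter exists yet.
fn topk_threshold(seen: &[i64], k: usize) -> Option<i64> {
    if k == 0 || seen.len() < k {
        return None;
    }
    let mut sorted = seen.to_vec();
    sorted.sort_unstable_by(|a, b| b.cmp(a)); // descending order
    Some(sorted[k - 1])
}

/// Keeps only the files that could still contribute a row to the top `k`.
/// A file is skipped when its max timestamp does not exceed the threshold.
fn prune_files(files: &[FileStats], threshold: Option<i64>) -> Vec<FileStats> {
    match threshold {
        // No threshold yet: every file must still be scanned.
        None => files.to_vec(),
        Some(t) => files.iter().filter(|f| f.max_ts > t).cloned().collect(),
    }
}
```

With ten rows from 2025 already seen and `k = 10`, the threshold is 2025, so a file whose max timestamp is 2024 is dropped and a file that reaches 2026 is kept. The strict `>` comparison is a simplification: a row that ties with the threshold would not change the result set of `ORDER BY timestamp DESC LIMIT 10`.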