adriangb commented on PR #22460: URL: https://github.com/apache/datafusion/pull/22460#issuecomment-4528965547
**Deferred follow-up: tighten the open-time `FilePruner` creation gate → #22495**

This PR keeps the gate behavior-preserving: `contains_dynamic_filter(p) || has_statistics()`. The only change is swapping the old `is_dynamic_physical_expr` for the downcast-based `contains_dynamic_filter`, so the opener no longer depends on `snapshot_generation`.

`contains_dynamic_filter` is a loose proxy for "can this file be pruned without column statistics" (i.e. a conjunct that folds to a constant via partition values). The precise, false-positive-free check is a per-conjunct "references only partition/constant columns" test: e.g. `part = a` (partition vs data column) must *not* qualify, while `part = 2 AND a > 5` must, because its `part = 2` conjunct references only the partition column.

Because real Parquet files almost always carry statistics (so `has_statistics()` short-circuits the gate), tightening the gate would only affect stats-less files. It is still a behavior change, so it's tracked separately in #22495 rather than folded into this refactor.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use
the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
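For illustration only, the per-conjunct "references only partition/constant columns" test described in the comment could look roughly like the sketch below. It uses a toy `Expr` enum, not DataFusion's physical expression types, and every name in it (`Expr`, `Op`, `split_conjuncts`, `only_partition_or_constant`, `can_prune_without_stats`, and the helper constructors) is hypothetical. It also assumes that a literal-only conjunct counts as constant; the actual design is whatever #22495 settles on.

```rust
use std::collections::HashSet;

/// Toy expression tree: a hypothetical stand-in for a physical expression,
/// not DataFusion's actual types.
#[derive(Debug, Clone)]
enum Expr {
    Column(String),
    Literal(i64),
    Binary(Box<Expr>, Op, Box<Expr>),
}

#[derive(Debug, Clone, Copy)]
enum Op {
    Eq,
    Gt,
    And,
}

// Small constructors to keep the examples readable.
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

fn lit(value: i64) -> Expr {
    Expr::Literal(value)
}

fn binary(left: Expr, op: Op, right: Expr) -> Expr {
    Expr::Binary(Box::new(left), op, Box::new(right))
}

/// Flatten a predicate into its top-level AND-ed conjuncts.
fn split_conjuncts<'a>(expr: &'a Expr, out: &mut Vec<&'a Expr>) {
    match expr {
        Expr::Binary(left, Op::And, right) => {
            split_conjuncts(left, out);
            split_conjuncts(right, out);
        }
        other => out.push(other),
    }
}

/// True if every column referenced by `expr` is a partition column.
/// Literals reference no columns, so they trivially pass (an assumption).
fn only_partition_or_constant(expr: &Expr, partition_cols: &HashSet<&str>) -> bool {
    match expr {
        Expr::Column(name) => partition_cols.contains(name.as_str()),
        Expr::Literal(_) => true,
        Expr::Binary(left, _, right) => {
            only_partition_or_constant(left, partition_cols)
                && only_partition_or_constant(right, partition_cols)
        }
    }
}

/// The gate sketched in the comment: a file can be pruned without column
/// statistics if at least one conjunct depends only on partition values
/// and constants.
fn can_prune_without_stats(predicate: &Expr, partition_cols: &HashSet<&str>) -> bool {
    let mut conjuncts = Vec::new();
    split_conjuncts(predicate, &mut conjuncts);
    conjuncts
        .iter()
        .any(|conjunct| only_partition_or_constant(conjunct, partition_cols))
}
```

With `part` as the only partition column, `part = a` fails the check because `a` is a data column, while `part = 2 AND a > 5` passes because its `part = 2` conjunct can be evaluated from partition values alone.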
