adriangb commented on PR #22460:
URL: https://github.com/apache/datafusion/pull/22460#issuecomment-4528965547

   **Deferred follow-up: tighten the open-time `FilePruner` creation gate → #22495**
   
   This PR keeps the gate behavior-preserving: `contains_dynamic_filter(p) || has_statistics()`. The only change is swapping the old `is_dynamic_physical_expr` for the downcast-based `contains_dynamic_filter`, so the opener no longer depends on `snapshot_generation`.
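
For readers who have not seen the gate, here is a minimal, self-contained sketch of its shape. It does not use DataFusion's types: `ToyExpr`, `Column`, `DynamicFilter`, `And`, and `should_create_pruner` are hypothetical stand-ins, and the real `contains_dynamic_filter` may walk the expression tree differently. The sketch only illustrates a downcast-based tree walk combined with the `|| has_statistics` short-circuit.

```rust
use std::any::Any;
use std::sync::Arc;

/// Toy expression trait (hypothetical; not DataFusion's `PhysicalExpr`).
trait ToyExpr: Any {
    fn as_any(&self) -> &dyn Any;
    fn children(&self) -> Vec<Arc<dyn ToyExpr>>;
}

/// A plain column reference.
struct Column;
/// Stand-in for a filter whose contents can change at runtime.
struct DynamicFilter {
    inner: Arc<dyn ToyExpr>,
}
/// Conjunction of two sub-expressions.
struct And {
    left: Arc<dyn ToyExpr>,
    right: Arc<dyn ToyExpr>,
}

impl ToyExpr for Column {
    fn as_any(&self) -> &dyn Any { self }
    fn children(&self) -> Vec<Arc<dyn ToyExpr>> { Vec::new() }
}
impl ToyExpr for DynamicFilter {
    fn as_any(&self) -> &dyn Any { self }
    fn children(&self) -> Vec<Arc<dyn ToyExpr>> { vec![Arc::clone(&self.inner)] }
}
impl ToyExpr for And {
    fn as_any(&self) -> &dyn Any { self }
    fn children(&self) -> Vec<Arc<dyn ToyExpr>> {
        vec![Arc::clone(&self.left), Arc::clone(&self.right)]
    }
}

/// Downcast-based check: is this node, or any descendant, a `DynamicFilter`?
fn contains_dynamic_filter(expr: &Arc<dyn ToyExpr>) -> bool {
    expr.as_any().is::<DynamicFilter>()
        || expr.children().iter().any(|c| contains_dynamic_filter(c))
}

/// The gate as described above: build a pruner if the predicate contains a
/// dynamic filter or the file carries statistics.
fn should_create_pruner(predicate: &Arc<dyn ToyExpr>, has_statistics: bool) -> bool {
    contains_dynamic_filter(predicate) || has_statistics
}
```

In this sketch a stats-less file with a purely static predicate skips pruner creation, while any file with statistics passes the gate regardless of the predicate.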
   
   `contains_dynamic_filter` is a loose proxy for "can this file be pruned without column statistics" (i.e. the predicate has a conjunct that folds to a constant via partition values). The precise, false-positive-free check is a per-conjunct "references only partition/constant columns" test: `part = a` (partition column vs. data column) must *not* qualify, while `part = 2 AND a > 5` must, because its `part = 2` conjunct references only a partition column. Real Parquet files almost always carry statistics, so `has_statistics()` short-circuits the gate and the tighter check only affects stats-less files. It is still a behavior change, so it's tracked separately in #22495 rather than folded into this refactor.
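
The per-conjunct test could look roughly like the sketch below. This is an illustration under stated assumptions, not the implementation planned for #22495: the `Expr` enum, `Op`, the helper constructors, and `prunable_without_stats` are all hypothetical, and only top-level `AND` is split into conjuncts.

```rust
use std::collections::HashSet;

/// Toy expression tree (hypothetical stand-in for a physical expression).
#[derive(Debug, Clone)]
enum Expr {
    Column(String),
    Literal(i64),
    Binary { left: Box<Expr>, op: Op, right: Box<Expr> },
}

#[derive(Debug, Clone, Copy, PartialEq)]
enum Op {
    Eq,
    Gt,
    And,
}

// Small constructors to keep examples readable.
fn col(name: &str) -> Expr { Expr::Column(name.to_string()) }
fn lit(v: i64) -> Expr { Expr::Literal(v) }
fn binary(left: Expr, op: Op, right: Expr) -> Expr {
    Expr::Binary { left: Box::new(left), op, right: Box::new(right) }
}

/// Collect the top-level AND conjuncts of `expr` into `out`.
fn split_conjuncts<'a>(expr: &'a Expr, out: &mut Vec<&'a Expr>) {
    match expr {
        Expr::Binary { left, op: Op::And, right } => {
            split_conjuncts(left, out);
            split_conjuncts(right, out);
        }
        other => out.push(other),
    }
}

/// True if every column referenced by `expr` is a partition column.
/// Literals reference no columns, so they always pass.
fn only_partition_columns(expr: &Expr, partition_cols: &HashSet<&str>) -> bool {
    match expr {
        Expr::Column(name) => partition_cols.contains(name.as_str()),
        Expr::Literal(_) => true,
        Expr::Binary { left, right, .. } => {
            only_partition_columns(left, partition_cols)
                && only_partition_columns(right, partition_cols)
        }
    }
}

/// A predicate qualifies if at least one conjunct can be evaluated from
/// partition values and constants alone, without column statistics.
fn prunable_without_stats(predicate: &Expr, partition_cols: &HashSet<&str>) -> bool {
    let mut conjuncts = Vec::new();
    split_conjuncts(predicate, &mut conjuncts);
    conjuncts.iter().any(|c| only_partition_columns(c, partition_cols))
}
```

With `part` as the only partition column, `part = a` yields no qualifying conjunct because it mixes a partition column with a data column, whereas `part = 2 AND a > 5` qualifies through its `part = 2` conjunct.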


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

