adriangb commented on code in PR #13795:
URL: https://github.com/apache/datafusion/pull/13795#discussion_r1888777746

##########
datafusion/physical-optimizer/src/pruning.rs:
##########
@@ -287,7 +287,12 @@ pub trait PruningStatistics {
 /// predicate can never possibly be true). The container can be pruned (skipped)
 /// entirely.
 ///
-/// Note that in order to be correct, `PruningPredicate` must return false
+/// While `PruningPredicate` will never return a `NULL` value, the
+/// rewritten predicate (as returned by `build_predicate_expression` and used internally
+/// by `PruningPredicate`) may evaluate to `NULL` when some of the min/max values
+/// or null / row counts are not known.

Review Comment:
   This has always been true and is also clarified lower down in the same docstring. I just wanted to restate it here because it has caused confusion in the past (even for @alamb!):
   https://github.com/apache/datafusion/blob/f4e65d2d9711ed097982d2fbde4191c402c05023/datafusion/physical-optimizer/src/pruning.rs#L300-L316

   The difference now is that if the null count or the row count is null, the rewritten predicate also evaluates to null whenever the min/max stats cannot prove that the file can be pruned.
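
   To make the NULL handling concrete, here is a minimal sketch of the idea in plain Rust. It is not DataFusion's implementation: `ColumnStats`, `and3`, `rewritten_predicate`, and `should_keep` are hypothetical names, `Option<bool>` stands in for a nullable boolean, and the rewritten shape `x_null_count != x_row_count AND x_min <= v AND v <= x_max` for a predicate `x = v` is an assumption chosen for illustration.

```rust
/// SQL-style three-valued AND; `None` stands in for NULL.
fn and3(a: Option<bool>, b: Option<bool>) -> Option<bool> {
    match (a, b) {
        // A single definite `false` decides the result, even next to NULL.
        (Some(false), _) | (_, Some(false)) => Some(false),
        (Some(true), Some(true)) => Some(true),
        // `true AND NULL` and `NULL AND NULL` are both NULL.
        _ => None,
    }
}

/// Hypothetical per-container statistics for one column.
/// `None` means the statistic is not known.
struct ColumnStats {
    min: Option<i64>,
    max: Option<i64>,
    null_count: Option<u64>,
    row_count: Option<u64>,
}

/// Illustrative rewrite of `x = value` (assumed shape, see lead-in):
/// `x_null_count != x_row_count AND x_min <= value AND value <= x_max`.
fn rewritten_predicate(stats: &ColumnStats, value: i64) -> Option<bool> {
    // NULL if either count is unknown.
    let not_all_null = match (stats.null_count, stats.row_count) {
        (Some(nulls), Some(rows)) => Some(nulls != rows),
        _ => None,
    };
    let min_ok = stats.min.map(|min| min <= value);
    let max_ok = stats.max.map(|max| value <= max);
    and3(not_all_null, and3(min_ok, max_ok))
}

/// The final decision is never NULL: a container is skipped only when the
/// rewritten predicate is definitely `false`; NULL is treated as "keep".
fn should_keep(stats: &ColumnStats, value: i64) -> bool {
    rewritten_predicate(stats, value) != Some(false)
}
```

   In this sketch, an unknown null count combined with min/max stats that rule the value out still yields `false` (the container is pruned), while an unknown null count combined with min/max stats that cannot rule it out yields NULL, which `should_keep` maps to `true`.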