adriangb commented on code in PR #13795:
URL: https://github.com/apache/datafusion/pull/13795#discussion_r1888777746


##########
datafusion/physical-optimizer/src/pruning.rs:
##########
@@ -287,7 +287,12 @@ pub trait PruningStatistics {
 ///   predicate can never possibly be true). The container can be pruned (skipped)
 ///   entirely.
 ///
-/// Note that in order to be correct, `PruningPredicate` must return false
+/// While `PruningPredicate` will never return a `NULL` value, the
+/// rewritten predicate (as returned by `build_predicate_expression` and used internally
+/// by `PruningPredicate`) may evaluate to `NULL` when some of the min/max values
+/// or null / row counts are not known.

Review Comment:
   This has always been true and is already spelled out lower down in the same docstring. I wanted to repeat it here because it has caused confusion in the past (even for @alamb!):
https://github.com/apache/datafusion/blob/f4e65d2d9711ed097982d2fbde4191c402c05023/datafusion/physical-optimizer/src/pruning.rs#L300-L316
   
   The difference now is that if the null count or row count is itself null (unknown), we will also return null in the case where the min/max stats alone can't prove that the file can be pruned.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
