alamb commented on code in PR #3044:
URL: https://github.com/apache/arrow-datafusion/pull/3044#discussion_r938932706
##########
datafusion/core/src/physical_optimizer/pruning.rs:
##########
@@ -1761,4 +1819,39 @@ mod tests {
let result = p.prune(&statistics).unwrap();
assert_eq!(result, expected_ret);
}
+
+ #[test]
+ fn prune_int32_is_null() {
+ let (schema, statistics) = int32_setup();
+
+ // Expression "i IS NULL" when there are no null statistics,
+ // should all be kept
+ let expected_ret = vec![true, true, true, true, true];
+
+ // i IS NULL, no null statistics
+ let expr = col("i").is_null();
+ let p = PruningPredicate::try_new(expr, schema.clone()).unwrap();
Review Comment:
Prior to the fix, this line would panic.
##########
datafusion/core/src/physical_optimizer/pruning.rs:
##########
@@ -1761,4 +1819,39 @@ mod tests {
let result = p.prune(&statistics).unwrap();
assert_eq!(result, expected_ret);
}
+
+ #[test]
+ fn prune_int32_is_null() {
+ let (schema, statistics) = int32_setup();
+
+ // Expression "i IS NULL" when there are no null statistics,
+ // should all be kept
+ let expected_ret = vec![true, true, true, true, true];
+
+ // i IS NULL, no null statistics
+ let expr = col("i").is_null();
+ let p = PruningPredicate::try_new(expr, schema.clone()).unwrap();
+ let result = p.prune(&statistics).unwrap();
+ assert_eq!(result, expected_ret);
+
+ // provide null counts for each column
+ let statistics = statistics.with_null_counts(
+ "i",
+ vec![
+ Some(0), // no nulls (don't keep)
+ Some(1), // 1 null
+ None, // unknown nulls
+ None, // unknown nulls (min/max are both null too, like no stats at all)
+ Some(0), // 0 nulls (max=null too which means no known max) (don't keep)
+ ],
+ );
+
+ let expected_ret = vec![false, true, true, true, false];
Review Comment:
As far as I could find, this case had no test coverage before.
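For readers following along, the expected vector in this test can be reproduced with a minimal standalone sketch of the rule the comments describe. This is not the code in `pruning.rs`; `keep_for_is_null` and `prune_is_null` are hypothetical names used only for illustration, and the sketch assumes the only input is a per-container null count:

```rust
/// Hypothetical helper (not DataFusion's implementation): decide whether a
/// container must be kept when evaluating `col IS NULL`, given only its
/// null-count statistic.
fn keep_for_is_null(null_count: Option<u64>) -> bool {
    match null_count {
        // Known to contain no nulls: `IS NULL` cannot match, safe to prune.
        Some(0) => false,
        // At least one null is known to exist: must keep.
        Some(_) => true,
        // Null count unknown: be conservative and keep.
        None => true,
    }
}

/// Apply the rule to every container; `true` means "keep", `false` means "prune".
fn prune_is_null(null_counts: &[Option<u64>]) -> Vec<bool> {
    null_counts.iter().map(|c| keep_for_is_null(*c)).collect()
}
```

Calling `prune_is_null` with the null counts from the test (`Some(0), Some(1), None, None, Some(0)`) gives `[false, true, true, true, false]`, and with all counts unknown every container is kept, matching the first half of the test.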