pepijnve commented on code in PR #17973:
URL: https://github.com/apache/datafusion/pull/17973#discussion_r2418438428
##########
datafusion/physical-expr/src/expressions/case.rs:
##########
@@ -431,6 +428,15 @@ impl CaseExpr {
_ => Cow::Owned(prep_null_mask_filter(when_value)),
};
+ let true_count = when_value.true_count();
+ if true_count == batch.num_rows() {
+ // Avoid evaluate_selection when all rows are true
+ return self.when_then_expr[0].1.evaluate(batch);
+ } else if true_count == 0 {
+ // Avoid evaluate_selection when all rows are false/null
+ return self.else_expr.as_ref().unwrap().evaluate(batch);
+ }
+
Review Comment:
I woke up early this morning to the realisation that the actual bug was a
subtlety in the implementation of `evaluate_selection`. Calling it with an
empty set of rows is different from calling it with a non-empty set of rows
and an all-false selection vector. The implementation was treating both cases
identically, which can cause a spurious row to get materialised. I've pushed a
correction for this and tweaked the comments in the code a bit.
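
To illustrate the distinction, here is a minimal sketch. It is a simplified model over plain slices, not the actual DataFusion code, and `evaluate_selection_sketch` is a hypothetical name. It assumes the result should always have one entry per input row, with unselected rows left as null:

```rust
/// Hypothetical model of selection-aware evaluation (not the DataFusion API).
/// Applies `f` to every row whose selection flag is true; unselected rows
/// yield `None`. The output always has the same length as the input.
fn evaluate_selection_sketch(
    rows: &[i64],
    selection: &[bool],
    f: impl Fn(i64) -> i64,
) -> Vec<Option<i64>> {
    // Assumption of this sketch: exactly one selection flag per input row.
    assert_eq!(rows.len(), selection.len(), "one selection flag per row");
    rows.iter()
        .zip(selection.iter())
        .map(|(&row, &selected)| if selected { Some(f(row)) } else { None })
        .collect()
}
```

In this model, an empty input produces an empty result, while three rows with an all-false selection produce three nulls. Both cases evaluate the expression on zero rows, but the output lengths differ, so they can't share a code path that only looks at the filtered row count.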