pepijnve commented on code in PR #17973:
URL: https://github.com/apache/datafusion/pull/17973#discussion_r2417677093


##########
datafusion/physical-expr/src/expressions/case.rs:
##########
@@ -431,6 +428,15 @@ impl CaseExpr {
             _ => Cow::Owned(prep_null_mask_filter(when_value)),
         };
 
+        let true_count = when_value.true_count();
+        if true_count == batch.num_rows() {
+            // Avoid evaluate_selection when all rows are true
+            return self.when_then_expr[0].1.evaluate(batch);
+        } else if true_count == 0 {
+            // Avoid evaluate_selection when all rows are false/null
+            return self.else_expr.as_ref().unwrap().evaluate(batch);
+        }
+

Review Comment:
   > My reading of this code is that it will still evaluate the then expression 
as long as there is at least one true value in when
   
   Yes, that's correct. There's no way to avoid that.
   
   This particular bit of code is both an optimisation and a correctness thing.
   
   From a performance point of view, we already know the selection vector is 
redundant, so there's really no point in calling `evaluate_selection`.
   
   For correctness, what's being avoided here is calling either `then` or `else` 
with a selection vector that will result in an empty record batch after 
filtering. We could add similar checks in `evaluate_selection` to prevent 
evaluating the downstream expression for empty record batches as well. Its 
current contract requires it to return an array with the same length as the 
unfiltered input batch though, so you can't avoid having to create an all-nulls 
array in that case.
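
To make the two fast paths and the `evaluate_selection` contract concrete, here is a rough sketch that models them with plain vectors instead of Arrow arrays. It is not DataFusion's code: `Column`, `Expr`, `double`, `negate`, `case_when`, and this version of `evaluate_selection` are hypothetical stand-ins, and the `when` mask is assumed to already have nulls mapped to false.

```rust
// Simplified model only: plain Vecs stand in for Arrow arrays and record batches.
// All names below are hypothetical and are not DataFusion's API.

type Column = Vec<Option<i64>>;

/// Hypothetical expression: maps a single-column batch to an output column.
type Expr = fn(&[i64]) -> Column;

fn double(batch: &[i64]) -> Column {
    batch.iter().map(|v| Some(v * 2)).collect()
}

fn negate(batch: &[i64]) -> Column {
    batch.iter().map(|v| Some(-v)).collect()
}

/// Models the contract described above: `expr` only sees the selected rows,
/// but the result has the same length as the unfiltered batch, with nulls
/// in the unselected positions.
fn evaluate_selection(expr: Expr, batch: &[i64], selection: &[bool]) -> Column {
    let filtered: Vec<i64> = batch
        .iter()
        .zip(selection)
        .filter(|(_, s)| **s)
        .map(|(v, _)| *v)
        .collect();
    // If nothing is selected, `expr` is called on an empty batch here.
    let mut results = expr(&filtered).into_iter();
    selection
        .iter()
        .map(|s| if *s { results.next().flatten() } else { None })
        .collect()
}

/// Single WHEN/THEN/ELSE with the two short-circuits from the diff.
fn case_when(when: &[bool], then_expr: Expr, else_expr: Expr, batch: &[i64]) -> Column {
    let true_count = when.iter().filter(|w| **w).count();
    if true_count == batch.len() {
        // All rows are true: the selection vector is redundant.
        return then_expr(batch);
    } else if true_count == 0 {
        // All rows are false/null: `then` would only see an empty batch.
        return else_expr(batch);
    }
    // Mixed case: both branches are evaluated on their own non-empty selections.
    let negated: Vec<bool> = when.iter().map(|w| !w).collect();
    let then_vals = evaluate_selection(then_expr, batch, when);
    let else_vals = evaluate_selection(else_expr, batch, &negated);
    when.iter()
        .zip(then_vals.into_iter().zip(else_vals))
        .map(|(w, (t, e))| if *w { t } else { e })
        .collect()
}
```

In this model, calling `evaluate_selection` with an all-false selection still has to produce a full-length column of nulls, which is the cost the check in `case_when` sidesteps by never making that call.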



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

