pepijnve commented on code in PR #17973:
URL: https://github.com/apache/datafusion/pull/17973#discussion_r2418438428


##########
datafusion/physical-expr/src/expressions/case.rs:
##########
@@ -431,6 +428,15 @@ impl CaseExpr {
             _ => Cow::Owned(prep_null_mask_filter(when_value)),
         };
 
+        let true_count = when_value.true_count();
+        if true_count == batch.num_rows() {
+            // Avoid evaluate_selection when all rows are true
+            return self.when_then_expr[0].1.evaluate(batch);
+        } else if true_count == 0 {
+            // Avoid evaluate_selection when all rows are false/null
+            return self.else_expr.as_ref().unwrap().evaluate(batch);
+        }
+

Review Comment:
   I woke up early this morning realising that the actual bug was a subtlety 
in the implementation of `evaluate_selection`. Calling it with an empty input 
is not the same as calling it with a non-empty input and an all-false selection 
vector, but the implementation treated both cases identically, which can cause 
a spurious row to be materialised. I've pushed a correction for this and 
tweaked the code comments a bit.
   
   I believe this properly addresses the original evaluation problem. All SQL 
logic tests pass even with the optimisation for all-true and all-false 
selection vectors in `expr_or_expr` commented out.
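
To illustrate the distinction, here is a minimal sketch using plain vectors instead of Arrow arrays. It is a toy model, not the DataFusion implementation: `Column`, `times_ten`, and this simplified `evaluate_selection` signature are made up for the example. The point it shows is that an empty input must yield zero rows, while a non-empty input with an all-false selection must yield one null per input row.

```rust
/// Toy stand-in for a single nullable column (hypothetical, not an Arrow array).
type Column = Vec<Option<i64>>;

/// Hypothetical expression used for the example: multiply every value by ten.
fn times_ten(rows: &[Option<i64>]) -> Column {
    rows.iter().map(|v| v.map(|x| x * 10)).collect()
}

/// Simplified model of selection-based evaluation: evaluate `expr` only on
/// the selected rows, then scatter the results back to their original
/// positions. Rows that were not selected come back as null.
fn evaluate_selection(
    batch: &[Option<i64>],
    selection: &[bool],
    expr: impl Fn(&[Option<i64>]) -> Column,
) -> Column {
    assert_eq!(batch.len(), selection.len());

    // Case 1: empty input. There is nothing to evaluate or scatter,
    // so the result has zero rows.
    if batch.is_empty() {
        return Vec::new();
    }

    // Keep only the rows where the selection vector is true.
    let selected: Column = batch
        .iter()
        .zip(selection)
        .filter(|(_, keep)| **keep)
        .map(|(v, _)| *v)
        .collect();

    // Case 2: non-empty input, all-false selection. Skip evaluation, but
    // still fall through to the scatter step so the output keeps the
    // input's row count.
    let evaluated = if selected.is_empty() {
        Vec::new()
    } else {
        expr(&selected)
    };

    // Scatter: one output row per input row, null where unselected.
    let mut values = evaluated.into_iter();
    selection
        .iter()
        .map(|keep| if *keep { values.next().flatten() } else { None })
        .collect()
}
```

In this model, `evaluate_selection(&[], &[], times_ten)` has length 0, whereas calling it on three rows with `[false, false, false]` has length 3 with every row null. Treating those two cases as one would produce either too many or too few rows in one of them.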



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


