petern48 commented on code in PR #17835:
URL: https://github.com/apache/datafusion/pull/17835#discussion_r2389665889


##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -1436,33 +1436,59 @@ impl<S: SimplifyInfo> TreeNodeRewriter for Simplifier<'_, S> {
 
             // CASE WHEN true THEN A ... END --> A
             // CASE WHEN X THEN A WHEN TRUE THEN B ... END --> CASE WHEN X THEN A ELSE B END
+            // CASE WHEN false THEN A END --> NULL
+            // CASE WHEN false THEN A ELSE B END --> B
+            // CASE WHEN X THEN A WHEN false THEN B END --> CASE WHEN X THEN A ELSE B END
             Expr::Case(Case {
                 expr: None,
                 mut when_then_expr,
-                else_expr: _,
+                mut else_expr,
                 // if let guard is not stabilized so we can't use it yet: https://github.com/rust-lang/rust/issues/51114
                 // Once it's supported we can avoid searching through when_then_expr twice in the below .any() and .position() calls
                 // }) if let Some(i) = when_then_expr.iter().position(|(when, _)| is_true(when.as_ref())) => {
             }) if when_then_expr
                 .iter()
-                .any(|(when, _)| is_true(when.as_ref())) =>
+                .any(|(when, _)| is_true(when.as_ref()) || is_false(when.as_ref())) =>
             {
-                let i = when_then_expr
-                    .iter()
-                    .position(|(when, _)| is_true(when.as_ref()))
-                    .unwrap();
-                let (_, then_) = when_then_expr.swap_remove(i);
-                // CASE WHEN true THEN A ... END --> A
-                if i == 0 {
-                    return Ok(Transformed::yes(*then_));
+                let mut remove_indices = Vec::with_capacity(when_then_expr.len());
+                let out_type = info.get_data_type(&when_then_expr[0].1)?;

Review Comment:
   Introducing this `get_data_type` call made some of the existing tests fail 
because it tried to resolve the data type of a column that doesn't exist in the 
schema. I updated those tests to use actual column names (e.g. `col("c1")`, 
`col("c3")`) or string literals (e.g. `lit("a")`) instead of invalid column 
names (e.g. `col("a")`), which is why there are so many seemingly unrelated 
changes in the old tests. When I ran queries in the CLI, DataFusion appeared to 
catch invalid column names before execution reached this code, so I think this 
should be safe.
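
To illustrate the failure mode, here is a minimal, self-contained sketch. It
does not use DataFusion's API: `ToySchema`, `ToyExpr`, `ToyDataType`, and this
`get_data_type` are hypothetical stand-ins, and the column types are chosen
arbitrarily. It only shows why a type lookup fails for a column missing from
the schema, while real columns and literals resolve:

```rust
use std::collections::HashMap;

// Hypothetical stand-ins, not DataFusion types.
#[derive(Debug, Clone, PartialEq)]
enum ToyDataType {
    Utf8,
    Int64,
}

#[derive(Debug, Clone)]
enum ToyExpr {
    Column(String),
    Literal(String), // only string literals are modeled here
}

struct ToySchema {
    fields: HashMap<String, ToyDataType>,
}

impl ToySchema {
    // Resolve the type of an expression against the schema.
    // An unknown column is an error, which is what broke the old tests.
    fn get_data_type(&self, expr: &ToyExpr) -> Result<ToyDataType, String> {
        match expr {
            ToyExpr::Column(name) => self
                .fields
                .get(name)
                .cloned()
                .ok_or_else(|| format!("no field named {}", name)),
            ToyExpr::Literal(_) => Ok(ToyDataType::Utf8),
        }
    }
}

// A schema with only `c1` and `c3`; the types are assumptions for this sketch.
fn test_schema() -> ToySchema {
    let mut fields = HashMap::new();
    fields.insert("c1".to_string(), ToyDataType::Utf8);
    fields.insert("c3".to_string(), ToyDataType::Int64);
    ToySchema { fields }
}
```

With this model, looking up `ToyExpr::Column("a")` returns an error, whereas
`ToyExpr::Column("c1")` and `ToyExpr::Literal("a")` both resolve to a type,
mirroring the test changes described above.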



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
