alamb commented on code in PR #8957:
URL: https://github.com/apache/arrow-datafusion/pull/8957#discussion_r1463146963
##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -281,7 +280,13 @@ impl<'a> TreeNodeRewriter for ConstEvaluator<'a> {
fn mutate(&mut self, expr: Expr) -> Result<Expr> {
match self.can_evaluate.pop() {
- Some(true) => Ok(Expr::Literal(self.evaluate_to_scalar(expr)?)),
+ Some(true) => {
+ let lit = self.evaluate_to_scalar(expr.clone());
+ match lit {
+ Ok(lit) => Ok(Expr::Literal(lit)),
+ Err(_) => Ok(expr),
+ }
+ }
Review Comment:
This change makes sense to me. Can we please add some rationale /
explanation here about *why* the error is ignored so future people who read
this code understand
Perhaps something like
```rust
// Certain expressions such as `CASE` and `COALESCE` are short circuiting
// and may not evalute all their sub expressions. Thus if
// if any error is countered during simplification, return the original
// so that normal evaluation can occur
```
##########
datafusion/optimizer/src/simplify_expressions/simplify_exprs.rs:
##########
@@ -138,28 +138,6 @@ mod tests {
ExprSchemable, JoinType,
};
- /// A macro to assert that one string is contained within another with
Review Comment:
💯 for reusing the existing one
##########
datafusion/physical-expr/src/expressions/case.rs:
##########
@@ -148,6 +148,11 @@ impl CaseExpr {
// Make sure we only consider rows that have not been matched yet
let when_match = and(&when_match, &remainder)?;
+ // When no rows available for when clause, skip then clause
+ if when_match.true_count() == 0 {
+ continue;
+ }
+
Review Comment:
```rust
// When no rows available for when clause, skip then clause
if when_match.true_count() == 0 {
continue;
}
```
I souble checked that this method accounts for nulls 👍
https://docs.rs/arrow/latest/arrow/array/struct.BooleanArray.html#method.true_count
##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -1792,27 +1768,6 @@ mod tests {
assert_eq!(simplify(expr), expected);
}
- #[test]
- fn test_simplify_divide_zero_by_zero() {
Review Comment:
Can we please ensure this new ignore error behavior is covered in these unit
tests?
Perhaps you could update `test_simplify_divide_by_zero` and instead expect
that no rewrite occurs rather than deleting the entire test.
Also, I think it is important to have a test that shows partial
simplification occurs even when part of the expression errors
For example, perhaps we can add a test like:
```
(1 + 2) + (4/0) --> 3 + (4/0)
```
##########
datafusion/physical-expr/src/expressions/case.rs:
##########
@@ -148,6 +148,11 @@ impl CaseExpr {
// Make sure we only consider rows that have not been matched yet
let when_match = and(&when_match, &remainder)?;
+ // When no rows available for when clause, skip then clause
+ if when_match.true_count() == 0 {
+ continue;
+ }
+
Review Comment:
> so the query like below, will get divide by zero error.
>
> `select case 1 when 2 the 4/0 end;`
Is that case covered by existing tests? If not, can you please add a test?
##########
datafusion/sqllogictest/test_files/arrow_typeof.slt:
##########
@@ -405,7 +405,7 @@ select arrow_cast([1], 'FixedSizeList(1, Int64)');
----
[1]
-query error DataFusion error: Optimizer rule 'simplify_expressions' failed
+query error
Review Comment:
would it be possible to update the slt files with the new error message ? It
would be easier to validate the test still makes sense
##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -281,7 +280,13 @@ impl<'a> TreeNodeRewriter for ConstEvaluator<'a> {
fn mutate(&mut self, expr: Expr) -> Result<Expr> {
match self.can_evaluate.pop() {
- Some(true) => Ok(Expr::Literal(self.evaluate_to_scalar(expr)?)),
+ Some(true) => {
+ let lit = self.evaluate_to_scalar(expr.clone());
Review Comment:
It seems a real shame that this has to clone the expression even for the
common case where there is no error
maybe we could use something like the following:
```rust
enum SimplifyResult {
// expr was simplifed and contains the new expression
Simplified(Expr),
// Evalaution encountered an error, contains the original expression
SimplifyRuntimeError(DataFusionError, Expr),
}
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]