alamb commented on code in PR #5819:
URL: https://github.com/apache/arrow-datafusion/pull/5819#discussion_r1155093800
##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -1118,6 +1119,22 @@ impl<'a, S: SimplifyInfo> TreeNodeRewriter for
Simplifier<'a, S> {
right,
}) => simplify_regex_expr(left, op, right)?,
+ // Rules for Like
+ Expr::Like(Like {
+ expr,
+ pattern,
+ negated,
+ escape_char: _,
+ }) if !is_null(&expr) && pattern.eq(&Box::new(lit("%"))) =>
lit(!negated),
Review Comment:
By writing `Box::new(...)` I think this is goign to do two allocations each
time (one for the Box and one for the String in the literal)
Would it be possible to rewrite this to a match statement without allocation?
```rust
matches!(pattern.as_ref(), Expression::Literal(ScalarValue::Utf8(..)))
```
?
You might have to define some better way to check for the literal scalar
##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -2320,16 +2337,10 @@ mod tests {
assert_no_change(regex_match(col("c1"), lit("f_o")));
// empty cases
- assert_change(regex_match(col("c1"), lit("")), like(col("c1"), "%"));
Review Comment:
🤔 the original rewrite rule looks incorrect to me (as in the orignal rule
shouldn't be rewriting regexp match). We can fix this in some other PR
```sql
postgres=# select regexp_match('foo', '');
regexp_match
--------------
{""}
(1 row)
postgres=# select like('foo', '%');
like
------
t
(1 row)
```
##########
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##########
@@ -2851,4 +2838,70 @@ mod tests {
or(col("c2").lt(lit(3)), col("c2").gt(lit(4)))
);
}
+
+ #[test]
+ fn test_like_and_ilke() {
+ // test non-null values
+ let expr = Expr::Like(Like::new(
+ false,
+ Box::new(col("c1")),
+ Box::new(lit("%")),
+ None,
+ ));
Review Comment:
What do you think about using the expr_fn's style for creating `like`
https://docs.rs/datafusion/latest/datafusion/prelude/enum.Expr.html#method.like
```suggestion
let expr = like(col("c1"), lit("%"));
```
I think that would make it clearer what cases these tests were covering
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]