Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
alamb merged PR #13260: URL: https://github.com/apache/datafusion/pull/13260 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
alamb commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2465520257 Thanks @findepi @adriangb and @goldmedal for this PR and the reviews -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2465026990 Applied comments and fixed handling of implicit escape. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1834565613
##
datafusion/sqllogictest/test_files/string/string_view.slt:
##
@@ -396,8 +396,9 @@ EXPLAIN SELECT
FROM test;
logical_plan
-01)Projection: test.column1_utf8view LIKE Utf8View("foo") AS like,
test.column1_utf8view ILIKE Utf8View("foo") AS ilike
-02)--TableScan: test projection=[column1_utf8view]
+01)Projection: __common_expr_1 AS like, __common_expr_1 AS ilike
Review Comment:
Done
##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
#[test]
fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
Review Comment:
Let's follow-up
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2464972049 (just rebased) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2464969396 The conflicts are because https://github.com/apache/datafusion/pull/13288 is now merged. Let me rebase. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
alamb commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2462907348 Looks like this branch has some conflicts now to resolve but then I think it will be ready to god -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
goldmedal commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1832707088
##
datafusion/sqllogictest/test_files/string/string_query.slt.part:
##
@@ -873,41 +873,30 @@ NULL NULL NULL NULL NULL
#Raphael datafusionДатаФусион false false false false
#NULL NULL NULL NULL NULL NULL
-# TODO (https://github.com/apache/datafusion/issues/12637) uncomment
additional test projections
-query TTBB
+query TT
SELECT ascii_1, unicode_1,
ascii_1 LIKE '%' AS ascii_1_like_percent,
- unicode_1 LIKE '%' AS unicode_1_like_percent
- -- ascii_1 LIKE '%%' AS ascii_1_like_percent_percent, -- TODO enable
after fixing https://github.com/apache/datafusion/issues/12637
- -- unicode_1 LIKE '%%' AS unicode_1_like_percent_percent -- TODO enable
after fixing https://github.com/apache/datafusion/issues/12637
+ unicode_1 LIKE '%' AS unicode_1_like_percent,
+ ascii_1 LIKE '%%' AS ascii_1_like_percent_percent,
+ unicode_1 LIKE '%%' AS unicode_1_like_percent_percent
FROM test_basic_operator
-Andrew datafusion📊🔥 true true
-Xiangpeng datafusion数据融合 true true
-Raphael datafusionДатаФусион true true
-under_score un iść core true true
-percent pan Tadeusz ma iść w kąt true true
-(empty) (empty) true true
-NULL NULL NULL NULL
-NULL NULL NULL NULL
+Andrew datafusion📊🔥 true true true true
+Xiangpeng datafusion数据融合 true true true true
+Raphael datafusionДатаФусион true true true true
+under_score un iść core true true true true
+percent pan Tadeusz ma iść w kąt true true true true
+(empty) (empty) true true true true
+NULL NULL NULL NULL NULL NULL
+NULL NULL NULL NULL NULL NULL
-# TODO (https://github.com/apache/datafusion/issues/12637) uncomment
additional test projections
-query TTBB
+query error DataFusion error: SQL error: ParserError\("Expected an expression,
found: FROM"\)
SELECT ascii_1, unicode_1,
ascii_1 NOT LIKE '%' AS ascii_1_not_like_percent,
- unicode_1 NOT LIKE '%' AS unicode_1_not_like_percent
- -- ascii_1 NOT LIKE '%%' AS ascii_1_not_like_percent_percent, -- TODO
enable after fixing https://github.com/apache/datafusion/issues/12637
- -- unicode_1 NOT LIKE '%%' AS unicode_1_not_like_percent_percent -- TODO
enable after fixing https://github.com/apache/datafusion/issues/12637
+ unicode_1 NOT LIKE '%' AS unicode_1_not_like_percent,
+ ascii_1 NOT LIKE '%%' AS ascii_1_not_like_percent_percent,
+ unicode_1 NOT LIKE '%%' AS unicode_1_not_like_percent_percent,
Review Comment:
```suggestion
unicode_1 NOT LIKE '%%' AS unicode_1_not_like_percent_percent
```
A redundant comma causes the parsing error.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1832420550
##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
#[test]
fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'
Review Comment:
will temporarily rebase on top of
https://github.com/apache/datafusion/pull/13288 for more test coverage
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2461793835 FYI, when testing LIKE patterns with potential escapes, beware of difference between sqllogictests and CLI when it comes to backslash (like implicit escape) -- https://github.com/apache/datafusion/issues/13286 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1832332042
##
datafusion/sqllogictest/test_files/string/string_query.slt.part:
##
@@ -873,41 +873,30 @@ NULL NULL NULL NULL NULL
#Raphael datafusionДатаФусион false false false false
#NULL NULL NULL NULL NULL NULL
-# TODO (https://github.com/apache/datafusion/issues/12637) uncomment
additional test projections
-query TTBB
+query TT
SELECT ascii_1, unicode_1,
ascii_1 LIKE '%' AS ascii_1_like_percent,
- unicode_1 LIKE '%' AS unicode_1_like_percent
- -- ascii_1 LIKE '%%' AS ascii_1_like_percent_percent, -- TODO enable
after fixing https://github.com/apache/datafusion/issues/12637
- -- unicode_1 LIKE '%%' AS unicode_1_like_percent_percent -- TODO enable
after fixing https://github.com/apache/datafusion/issues/12637
+ unicode_1 LIKE '%' AS unicode_1_like_percent,
+ ascii_1 LIKE '%%' AS ascii_1_like_percent_percent,
+ unicode_1 LIKE '%%' AS unicode_1_like_percent_percent
FROM test_basic_operator
-Andrew datafusion📊🔥 true true
-Xiangpeng datafusion数据融合 true true
-Raphael datafusionДатаФусион true true
-under_score un iść core true true
-percent pan Tadeusz ma iść w kąt true true
-(empty) (empty) true true
-NULL NULL NULL NULL
-NULL NULL NULL NULL
+Andrew datafusion📊🔥 true true true true
+Xiangpeng datafusion数据融合 true true true true
+Raphael datafusionДатаФусион true true true true
+under_score un iść core true true true true
+percent pan Tadeusz ma iść w kąt true true true true
+(empty) (empty) true true true true
+NULL NULL NULL NULL NULL NULL
+NULL NULL NULL NULL NULL NULL
-# TODO (https://github.com/apache/datafusion/issues/12637) uncomment
additional test projections
-query TTBB
+query error DataFusion error: SQL error: ParserError\("Expected an expression,
found: FROM"\)
Review Comment:
bug in the test query
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1832239184
##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
#[test]
fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'
Review Comment:
The `%%` simplification is wrong. `\` is implicit escape. Let me update.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1831797615
##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
#[test]
fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'
Review Comment:
> In Postgres, `%%` doesn't mean matching the actual literal '%'. The escape
character is `\`.
thanks @goldmedal for checking this.
in our case, we don't have default escape character at all, i believe.
at least we don't define it on Like object.
i need to check whether we have an _implicit_ default escape character
applied somewhere later.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
alamb commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1831473963
##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -1470,34 +1471,67 @@ impl<'a, S: SimplifyInfo> TreeNodeRewriter for
Simplifier<'a, S> {
}) => Transformed::yes(simplify_regex_expr(left, op, right)?),
// Rules for Like
-Expr::Like(Like {
-expr,
-pattern,
-negated,
-escape_char: _,
-case_insensitive: _,
-}) if matches!(
-pattern.as_ref(),
-Expr::Literal(ScalarValue::Utf8(Some(pattern_str))) if
pattern_str == "%"
-) || matches!(
-pattern.as_ref(),
-Expr::Literal(ScalarValue::LargeUtf8(Some(pattern_str))) if
pattern_str == "%"
-) || matches!(
-pattern.as_ref(),
-Expr::Literal(ScalarValue::Utf8View(Some(pattern_str))) if
pattern_str == "%"
-) =>
-{
-// exp LIKE '%' is
-// - when exp is not NULL, it's true
-// - when exp is NULL, it's NULL
-// exp NOT LIKE '%' is
-// - when exp is not NULL, it's false
-// - when exp is NULL, it's NULL
-Transformed::yes(Expr::Case(Case {
-expr: Some(Box::new(Expr::IsNotNull(expr))),
-when_then_expr: vec![(Box::new(lit(true)),
Box::new(lit(!negated)))],
-else_expr: None,
-}))
+Expr::Like(like) => {
+match as_string_scalar(&like.pattern) {
+Some((data_type, pattern_str)) => {
+match pattern_str {
+None => return
Ok(Transformed::yes(lit_bool_null())),
+Some(pattern_str) if pattern_str == "%" => {
+// exp LIKE '%' is
+// - when exp is not NULL, it's true
+// - when exp is NULL, it's NULL
+// exp NOT LIKE '%' is
+// - when exp is not NULL, it's false
+// - when exp is NULL, it's NULL
+let result_for_non_null = lit(!like.negated);
+Transformed::yes(if
!info.nullable(&like.expr)? {
+result_for_non_null
+} else {
+Expr::Case(Case {
+expr:
Some(Box::new(Expr::IsNotNull(like.expr))),
+when_then_expr: vec![(
+Box::new(lit(true)),
+Box::new(result_for_non_null),
+)],
+else_expr: None,
+})
+})
+}
+Some(pattern_str)
Review Comment:
```suggestion
Some(pattern_str)
// Repeated occurrences of wildcard are
redundant so remove them
// exp LIKE '%%' --> exp LIKE '%'
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
alamb commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1831471559
##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
#[test]
fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'
Review Comment:
I was definitely confused -- thank you
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
crepererum commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2460111460 since there are already two reviewers on it, I'll gonna skip this PR. However if you need my input, ping me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
goldmedal commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1830993512
##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
#[test]
fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'
Review Comment:
> I thought `expr LIKE '%%'` means the same as `expr = '%'` (match the
actual literal character `%` )
>
> Specifically that `%%` is how to escape the wildcard to not be a wildcard
I think it depends on how we define it. I assume we prefer to follow the
Postgres behavior. 🤔
In Postgres, `%%` doesn't mean matching the actual literal '%'. The escape
character is `\`.
```
test=# select '1' like '%%';
?column?
--
t
(1 row)
test=# select '%' like '%%';
?column?
--
t
(1 row)
test=# select '%' like '\%';
?column?
--
t
(1 row)
test=# select '1' like '\%';
?column?
--
f
(1 row)
```
I think it is just a redundant wildcard. So, we can simplify it to be only
one `%`.
By the way, in DuckDB, there is a similar behavior for double '%%'.
```
D select '1' like '%%';
┌───┐
│ ('1' ~~ '%%') │
│boolean│
├───┤
│ true │
└───┘
D select '%' like '%%';
┌───┐
│ ('%' ~~ '%%') │
│boolean│
├───┤
│ true │
└───┘
However, the escape character should be set by `ESCAPE` syntax
```
D select '%' like '\%';
┌───┐
│ ('%' ~~ '\%') │
│boolean│
├───┤
│ false │
└───┘
D select '%' like '\%' escape '\';
┌──┐
│ main.like_escape('%', '\%', '\') │
│ boolean │
├──┤
│ true │
└──┘
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
goldmedal commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1830993512
##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
#[test]
fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'
Review Comment:
> I thought `expr LIKE '%%'` means the same as `expr = '%'` (match the
actual literal character `%` )
>
> Specifically that `%%` is how to escape the wildcard to not be a wildcard
I think it depends on how we define it. I assume we prefer to follow the
Postgres behavior. 🤔
In Postgres, `%%` doesn't mean matching the actual literal '%'. In Postgres,
the escape character is `\`.
```
test=# select '1' like '%%';
?column?
--
t
(1 row)
test=# select '%' like '%%';
?column?
--
t
(1 row)
test=# select '%' like '\%';
?column?
--
t
(1 row)
test=# select '1' like '\%';
?column?
--
f
(1 row)
```
I think it is just a redundant wildcard. So, we can simplify it to be only
one `%`.
By the way, in DuckDB, there is a similar behavior for double '%%'.
```
D select '1' like '%%';
┌───┐
│ ('1' ~~ '%%') │
│boolean│
├───┤
│ true │
└───┘
D select '%' like '%%';
┌───┐
│ ('%' ~~ '%%') │
│boolean│
├───┤
│ true │
└───┘
However, the escape character should be set by `ESCAPE` syntax
```
D select '%' like '\%';
┌───┐
│ ('%' ~~ '\%') │
│boolean│
├───┤
│ false │
└───┘
D select '%' like '\%' escape '\';
┌──┐
│ main.like_escape('%', '\%', '\') │
│ boolean │
├──┤
│ true │
└──┘
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
goldmedal commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1830993512
##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
#[test]
fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'
Review Comment:
> I thought `expr LIKE '%%'` means the same as `expr = '%'` (match the
actual literal character `%` )
>
> Specifically that `%%` is how to escape the wildcard to not be a wildcard
I think it depends on how we define it. I assume we prefer to follow the
Postgres behavior. 🤔
In Postgres, `%%` doesn't mean matching the actual literal '%'. The escape
character is `\`.
```
test=# select '1' like '%%';
?column?
--
t
(1 row)
test=# select '%' like '%%';
?column?
--
t
(1 row)
test=# select '%' like '\%';
?column?
--
t
(1 row)
test=# select '1' like '\%';
?column?
--
f
(1 row)
```
I think `%%` is just a redundant wildcard. So, we can simplify it to be only
one `%`.
By the way, in DuckDB, there is a similar behavior for double '%%'.
```
D select '1' like '%%';
┌───┐
│ ('1' ~~ '%%') │
│boolean│
├───┤
│ true │
└───┘
D select '%' like '%%';
┌───┐
│ ('%' ~~ '%%') │
│boolean│
├───┤
│ true │
└───┘
However, the escape character should be set by `ESCAPE` syntax
```
D select '%' like '\%';
┌───┐
│ ('%' ~~ '\%') │
│boolean│
├───┤
│ false │
└───┘
D select '%' like '\%' escape '\';
┌──┐
│ main.like_escape('%', '\%', '\') │
│ boolean │
├──┤
│ true │
└──┘
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
goldmedal commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1831061785
##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
#[test]
fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
Review Comment:
```suggestion
let expr = col("c1").like(null.clone());
```
As @alamb's comment
(https://github.com/apache/datafusion/pull/13260#discussion_r1830873028), we
can do some refactor like this.
##
datafusion/sqllogictest/test_files/string/string_view.slt:
##
@@ -396,8 +396,9 @@ EXPLAIN SELECT
FROM test;
logical_plan
-01)Projection: test.column1_utf8view LIKE Utf8View("foo") AS like,
test.column1_utf8view ILIKE Utf8View("foo") AS ilike
-02)--TableScan: test projection=[column1_utf8view]
+01)Projection: __common_expr_1 AS like, __common_expr_1 AS ilike
Review Comment:
Agreed. I did some tests. I think they still use the native StringView
implementation for pattern matching.
```
query TT
EXPLAIN SELECT
column1_utf8view like '%foo%' as "like",
column1_utf8view ilike '%foo%' as "ilike"
FROM test;
logical_plan
01)Projection: test.column1_utf8view LIKE Utf8View("%foo%") AS like,
test.column1_utf8view ILIKE Utf8View("%foo%") AS ilike
02)--TableScan: test projection=[column1_utf8view]
```
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
