Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-08 Thread via GitHub


alamb merged PR #13260:
URL: https://github.com/apache/datafusion/pull/13260


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-08 Thread via GitHub


alamb commented on PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2465520257

   Thanks @findepi @adriangb and @goldmedal for this PR and the reviews


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-08 Thread via GitHub


findepi commented on PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2465026990

   Applied comments and fixed handling of implicit escape. 


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-08 Thread via GitHub


findepi commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1834565613


##
datafusion/sqllogictest/test_files/string/string_view.slt:
##
@@ -396,8 +396,9 @@ EXPLAIN SELECT
 FROM test;
 
 logical_plan
-01)Projection: test.column1_utf8view LIKE Utf8View("foo") AS like, 
test.column1_utf8view ILIKE Utf8View("foo") AS ilike
-02)--TableScan: test projection=[column1_utf8view]
+01)Projection: __common_expr_1 AS like, __common_expr_1 AS ilike

Review Comment:
   Done



##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
 
 #[test]
 fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());

Review Comment:
   Let's follow-up



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-08 Thread via GitHub


findepi commented on PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2464972049

   (just rebased)


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-08 Thread via GitHub


findepi commented on PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2464969396

   The conflicts are because https://github.com/apache/datafusion/pull/13288 is 
now merged. Let me rebase.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-07 Thread via GitHub


alamb commented on PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2462907348

   Looks like this branch has some conflicts now to resolve but then I think it 
will be ready to god


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-07 Thread via GitHub


goldmedal commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1832707088


##
datafusion/sqllogictest/test_files/string/string_query.slt.part:
##
@@ -873,41 +873,30 @@ NULL NULL NULL NULL NULL
 #Raphael datafusionДатаФусион false false false false
 #NULL NULL NULL NULL NULL NULL
 
-# TODO (https://github.com/apache/datafusion/issues/12637) uncomment 
additional test projections
-query TTBB
+query TT
 SELECT ascii_1, unicode_1,
   ascii_1 LIKE '%' AS ascii_1_like_percent,
-  unicode_1 LIKE '%' AS unicode_1_like_percent
-  -- ascii_1 LIKE '%%' AS ascii_1_like_percent_percent, -- TODO enable 
after fixing https://github.com/apache/datafusion/issues/12637
-  -- unicode_1 LIKE '%%' AS unicode_1_like_percent_percent -- TODO enable 
after fixing https://github.com/apache/datafusion/issues/12637
+  unicode_1 LIKE '%' AS unicode_1_like_percent,
+  ascii_1 LIKE '%%' AS ascii_1_like_percent_percent,
+  unicode_1 LIKE '%%' AS unicode_1_like_percent_percent
 FROM test_basic_operator
 
-Andrew datafusion📊🔥 true true
-Xiangpeng datafusion数据融合 true true
-Raphael datafusionДатаФусион true true
-under_score un iść core true true
-percent pan Tadeusz ma iść w kąt true true
-(empty) (empty) true true
-NULL NULL NULL NULL
-NULL NULL NULL NULL
+Andrew datafusion📊🔥 true true true true
+Xiangpeng datafusion数据融合 true true true true
+Raphael datafusionДатаФусион true true true true
+under_score un iść core true true true true
+percent pan Tadeusz ma iść w kąt true true true true
+(empty) (empty) true true true true
+NULL NULL NULL NULL NULL NULL
+NULL NULL NULL NULL NULL NULL
 
-# TODO (https://github.com/apache/datafusion/issues/12637) uncomment 
additional test projections
-query TTBB
+query error DataFusion error: SQL error: ParserError\("Expected an expression, 
found: FROM"\)
 SELECT ascii_1, unicode_1,
   ascii_1 NOT LIKE '%' AS ascii_1_not_like_percent,
-  unicode_1 NOT LIKE '%' AS unicode_1_not_like_percent
-  -- ascii_1 NOT LIKE '%%' AS ascii_1_not_like_percent_percent, -- TODO 
enable after fixing https://github.com/apache/datafusion/issues/12637
-  -- unicode_1 NOT LIKE '%%' AS unicode_1_not_like_percent_percent -- TODO 
enable after fixing https://github.com/apache/datafusion/issues/12637
+  unicode_1 NOT LIKE '%' AS unicode_1_not_like_percent,
+  ascii_1 NOT LIKE '%%' AS ascii_1_not_like_percent_percent,
+  unicode_1 NOT LIKE '%%' AS unicode_1_not_like_percent_percent,

Review Comment:
   ```suggestion
 unicode_1 NOT LIKE '%%' AS unicode_1_not_like_percent_percent
   ```
   A redundant comma causes the parsing error.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-07 Thread via GitHub


findepi commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1832420550


##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
 
 #[test]
 fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'

Review Comment:
   will temporarily rebase on top of 
https://github.com/apache/datafusion/pull/13288 for more test coverage



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-07 Thread via GitHub


findepi commented on PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2461793835

   FYI, when testing LIKE patterns with potential escapes, beware of difference 
between sqllogictests and CLI when it comes to backslash (like implicit escape) 
-- https://github.com/apache/datafusion/issues/13286


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-07 Thread via GitHub


findepi commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1832332042


##
datafusion/sqllogictest/test_files/string/string_query.slt.part:
##
@@ -873,41 +873,30 @@ NULL NULL NULL NULL NULL
 #Raphael datafusionДатаФусион false false false false
 #NULL NULL NULL NULL NULL NULL
 
-# TODO (https://github.com/apache/datafusion/issues/12637) uncomment 
additional test projections
-query TTBB
+query TT
 SELECT ascii_1, unicode_1,
   ascii_1 LIKE '%' AS ascii_1_like_percent,
-  unicode_1 LIKE '%' AS unicode_1_like_percent
-  -- ascii_1 LIKE '%%' AS ascii_1_like_percent_percent, -- TODO enable 
after fixing https://github.com/apache/datafusion/issues/12637
-  -- unicode_1 LIKE '%%' AS unicode_1_like_percent_percent -- TODO enable 
after fixing https://github.com/apache/datafusion/issues/12637
+  unicode_1 LIKE '%' AS unicode_1_like_percent,
+  ascii_1 LIKE '%%' AS ascii_1_like_percent_percent,
+  unicode_1 LIKE '%%' AS unicode_1_like_percent_percent
 FROM test_basic_operator
 
-Andrew datafusion📊🔥 true true
-Xiangpeng datafusion数据融合 true true
-Raphael datafusionДатаФусион true true
-under_score un iść core true true
-percent pan Tadeusz ma iść w kąt true true
-(empty) (empty) true true
-NULL NULL NULL NULL
-NULL NULL NULL NULL
+Andrew datafusion📊🔥 true true true true
+Xiangpeng datafusion数据融合 true true true true
+Raphael datafusionДатаФусион true true true true
+under_score un iść core true true true true
+percent pan Tadeusz ma iść w kąt true true true true
+(empty) (empty) true true true true
+NULL NULL NULL NULL NULL NULL
+NULL NULL NULL NULL NULL NULL
 
-# TODO (https://github.com/apache/datafusion/issues/12637) uncomment 
additional test projections
-query TTBB
+query error DataFusion error: SQL error: ParserError\("Expected an expression, 
found: FROM"\)

Review Comment:
   bug in the test query



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-07 Thread via GitHub


findepi commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1832239184


##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
 
 #[test]
 fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'

Review Comment:
   The `%%` simplification is wrong. `\` is implicit escape. Let me update.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-06 Thread via GitHub


findepi commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1831797615


##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
 
 #[test]
 fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'

Review Comment:
   > In Postgres, `%%` doesn't mean matching the actual literal '%'. The escape 
character is `\`.
   
   thanks @goldmedal for checking this.
   
   
   in our case, we don't have default escape character at all, i believe.
   at least we don't define it on Like object.
   
   i need to check whether we have an _implicit_ default escape character 
applied somewhere later.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-06 Thread via GitHub


alamb commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1831473963


##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -1470,34 +1471,67 @@ impl<'a, S: SimplifyInfo> TreeNodeRewriter for 
Simplifier<'a, S> {
 }) => Transformed::yes(simplify_regex_expr(left, op, right)?),
 
 // Rules for Like
-Expr::Like(Like {
-expr,
-pattern,
-negated,
-escape_char: _,
-case_insensitive: _,
-}) if matches!(
-pattern.as_ref(),
-Expr::Literal(ScalarValue::Utf8(Some(pattern_str))) if 
pattern_str == "%"
-) || matches!(
-pattern.as_ref(),
-Expr::Literal(ScalarValue::LargeUtf8(Some(pattern_str))) if 
pattern_str == "%"
-) || matches!(
-pattern.as_ref(),
-Expr::Literal(ScalarValue::Utf8View(Some(pattern_str))) if 
pattern_str == "%"
-) =>
-{
-// exp LIKE '%' is
-//   - when exp is not NULL, it's true
-//   - when exp is NULL, it's NULL
-// exp NOT LIKE '%' is
-//   - when exp is not NULL, it's false
-//   - when exp is NULL, it's NULL
-Transformed::yes(Expr::Case(Case {
-expr: Some(Box::new(Expr::IsNotNull(expr))),
-when_then_expr: vec![(Box::new(lit(true)), 
Box::new(lit(!negated)))],
-else_expr: None,
-}))
+Expr::Like(like) => {
+match as_string_scalar(&like.pattern) {
+Some((data_type, pattern_str)) => {
+match pattern_str {
+None => return 
Ok(Transformed::yes(lit_bool_null())),
+Some(pattern_str) if pattern_str == "%" => {
+// exp LIKE '%' is
+//   - when exp is not NULL, it's true
+//   - when exp is NULL, it's NULL
+// exp NOT LIKE '%' is
+//   - when exp is not NULL, it's false
+//   - when exp is NULL, it's NULL
+let result_for_non_null = lit(!like.negated);
+Transformed::yes(if 
!info.nullable(&like.expr)? {
+result_for_non_null
+} else {
+Expr::Case(Case {
+expr: 
Some(Box::new(Expr::IsNotNull(like.expr))),
+when_then_expr: vec![(
+Box::new(lit(true)),
+Box::new(result_for_non_null),
+)],
+else_expr: None,
+})
+})
+}
+Some(pattern_str)

Review Comment:
   
   ```suggestion
   Some(pattern_str)
   // Repeated occurrences of wildcard are 
redundant so remove them
   // exp LIKE '%%'  --> exp LIKE '%'
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-06 Thread via GitHub


alamb commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1831471559


##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
 
 #[test]
 fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'

Review Comment:
   I was definitely confused -- thank you



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-06 Thread via GitHub


crepererum commented on PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2460111460

   since there are already two reviewers on it, I'll gonna skip this PR. 
However if you need my input, ping me.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-06 Thread via GitHub


goldmedal commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1830993512


##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
 
 #[test]
 fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'

Review Comment:
   > I thought `expr LIKE '%%'` means the same as `expr = '%'` (match the 
actual literal character `%` )
   > 
   > Specifically that `%%` is how to escape the wildcard to not be a wildcard
   
   I think it depends on how we define it. I assume we prefer to follow the 
Postgres behavior. 🤔 
   In Postgres, `%%` doesn't mean matching the actual literal '%'. The escape 
character is `\`.
   ```
   test=# select '1' like '%%';
?column? 
   --
t
   (1 row)
   
   test=# select '%' like '%%';
?column? 
   --
t
   (1 row)
   
   test=# select '%' like '\%';
?column? 
   --
t
   (1 row)
   
   test=# select '1' like '\%';
?column? 
   --
f
   (1 row)
   ```
   
   I think it is just a redundant wildcard. So, we can simplify it to be only 
one `%`.
   
   
   By the way, in DuckDB, there is a similar behavior for double '%%'.
   ```
   D select '1' like '%%';
   ┌───┐
   │ ('1' ~~ '%%') │
   │boolean│
   ├───┤
   │ true  │
   └───┘
   D select '%' like '%%';
   ┌───┐
   │ ('%' ~~ '%%') │
   │boolean│
   ├───┤
   │ true  │
   └───┘
   
   However, the escape character should be set by `ESCAPE` syntax
   ```
   D select '%' like '\%';
   ┌───┐
   │ ('%' ~~ '\%') │
   │boolean│
   ├───┤
   │ false │
   └───┘
   D select '%' like '\%' escape '\';
   ┌──┐
   │ main.like_escape('%', '\%', '\') │
   │ boolean  │
   ├──┤
   │ true │
   └──┘
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-06 Thread via GitHub


goldmedal commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1830993512


##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
 
 #[test]
 fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'

Review Comment:
   > I thought `expr LIKE '%%'` means the same as `expr = '%'` (match the 
actual literal character `%` )
   > 
   > Specifically that `%%` is how to escape the wildcard to not be a wildcard
   
   I think it depends on how we define it. I assume we prefer to follow the 
Postgres behavior. 🤔 
   In Postgres, `%%` doesn't mean matching the actual literal '%'. In Postgres, 
the escape character is `\`.
   ```
   test=# select '1' like '%%';
?column? 
   --
t
   (1 row)
   
   test=# select '%' like '%%';
?column? 
   --
t
   (1 row)
   
   test=# select '%' like '\%';
?column? 
   --
t
   (1 row)
   
   test=# select '1' like '\%';
?column? 
   --
f
   (1 row)
   ```
   
   I think it is just a redundant wildcard. So, we can simplify it to be only 
one `%`.
   
   
   By the way, in DuckDB, there is a similar behavior for double '%%'.
   ```
   D select '1' like '%%';
   ┌───┐
   │ ('1' ~~ '%%') │
   │boolean│
   ├───┤
   │ true  │
   └───┘
   D select '%' like '%%';
   ┌───┐
   │ ('%' ~~ '%%') │
   │boolean│
   ├───┤
   │ true  │
   └───┘
   
   However, the escape character should be set by `ESCAPE` syntax
   ```
   D select '%' like '\%';
   ┌───┐
   │ ('%' ~~ '\%') │
   │boolean│
   ├───┤
   │ false │
   └───┘
   D select '%' like '\%' escape '\';
   ┌──┐
   │ main.like_escape('%', '\%', '\') │
   │ boolean  │
   ├──┤
   │ true │
   └──┘
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-06 Thread via GitHub


goldmedal commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1830993512


##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
 
 #[test]
 fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_ilike(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+// expr [NOT] [I]LIKE '%'
+let expr = like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_like(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+let expr = ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), true));
+
+let expr = not_ilike(col("c1"), lit("%"));
+assert_eq!(simplify(expr), if_not_null(col("c1"), false));
+
+// expr [NOT] [I]LIKE '%%'

Review Comment:
   > I thought `expr LIKE '%%'` means the same as `expr = '%'` (match the 
actual literal character `%` )
   > 
   > Specifically that `%%` is how to escape the wildcard to not be a wildcard
   
   I think it depends on how we define it. I assume we prefer to follow the 
Postgres behavior. 🤔 
   In Postgres, `%%` doesn't mean matching the actual literal '%'. The escape 
character is `\`.
   ```
   test=# select '1' like '%%';
?column? 
   --
t
   (1 row)
   
   test=# select '%' like '%%';
?column? 
   --
t
   (1 row)
   
   test=# select '%' like '\%';
?column? 
   --
t
   (1 row)
   
   test=# select '1' like '\%';
?column? 
   --
f
   (1 row)
   ```
   
   I think `%%` is just a redundant wildcard. So, we can simplify it to be only 
one `%`.
   
   
   By the way, in DuckDB, there is a similar behavior for double '%%'.
   ```
   D select '1' like '%%';
   ┌───┐
   │ ('1' ~~ '%%') │
   │boolean│
   ├───┤
   │ true  │
   └───┘
   D select '%' like '%%';
   ┌───┐
   │ ('%' ~~ '%%') │
   │boolean│
   ├───┤
   │ true  │
   └───┘
   
   However, the escape character should be set by `ESCAPE` syntax
   ```
   D select '%' like '\%';
   ┌───┐
   │ ('%' ~~ '\%') │
   │boolean│
   ├───┤
   │ false │
   └───┘
   D select '%' like '\%' escape '\';
   ┌──┐
   │ main.like_escape('%', '\%', '\') │
   │ boolean  │
   ├──┤
   │ true │
   └──┘
   ```
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org



Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]

2024-11-06 Thread via GitHub


goldmedal commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1831061785


##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
 
 #[test]
 fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());

Review Comment:
   ```suggestion
   let expr = col("c1").like(null.clone());
   ```
   As @alamb's comment 
(https://github.com/apache/datafusion/pull/13260#discussion_r1830873028), we 
can do some refactor like this.



##
datafusion/sqllogictest/test_files/string/string_view.slt:
##
@@ -396,8 +396,9 @@ EXPLAIN SELECT
 FROM test;
 
 logical_plan
-01)Projection: test.column1_utf8view LIKE Utf8View("foo") AS like, 
test.column1_utf8view ILIKE Utf8View("foo") AS ilike
-02)--TableScan: test projection=[column1_utf8view]
+01)Projection: __common_expr_1 AS like, __common_expr_1 AS ilike

Review Comment:
   Agreed. I did some tests. I think they still use the native StringView 
implementation for pattern matching.
   ```
   query TT
   EXPLAIN SELECT
 column1_utf8view like '%foo%' as "like",
 column1_utf8view ilike '%foo%' as "ilike"
   FROM test;
   
   logical_plan
   01)Projection: test.column1_utf8view LIKE Utf8View("%foo%") AS like, 
test.column1_utf8view ILIKE Utf8View("%foo%") AS ilike
   02)--TableScan: test projection=[column1_utf8view]
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


-
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org