Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
alamb merged PR #13260: URL: https://github.com/apache/datafusion/pull/13260 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
alamb commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2465520257 Thanks @findepi @adriangb and @goldmedal for this PR and the reviews -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2465026990 Applied comments and fixed handling of implicit escape. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on code in PR #13260: URL: https://github.com/apache/datafusion/pull/13260#discussion_r1834565613 ## datafusion/sqllogictest/test_files/string/string_view.slt: ## @@ -396,8 +396,9 @@ EXPLAIN SELECT FROM test; logical_plan -01)Projection: test.column1_utf8view LIKE Utf8View("foo") AS like, test.column1_utf8view ILIKE Utf8View("foo") AS ilike -02)--TableScan: test projection=[column1_utf8view] +01)Projection: __common_expr_1 AS like, __common_expr_1 AS ilike Review Comment: Done ## datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs: ## @@ -3633,32 +3697,123 @@ mod tests { #[test] fn test_like_and_ilke() { -// LIKE '%' -let expr = like(col("c1"), "%"); +let null = lit(ScalarValue::Utf8(None)); + +// expr [NOT] [I]LIKE NULL +let expr = like(col("c1"), null.clone()); Review Comment: Let's follow-up -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2464972049 (just rebased) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2464969396 The conflicts are because https://github.com/apache/datafusion/pull/13288 is now merged. Let me rebase. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
alamb commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2462907348 Looks like this branch has some conflicts now to resolve but then I think it will be ready to god -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
goldmedal commented on code in PR #13260: URL: https://github.com/apache/datafusion/pull/13260#discussion_r1832707088 ## datafusion/sqllogictest/test_files/string/string_query.slt.part: ## @@ -873,41 +873,30 @@ NULL NULL NULL NULL NULL #Raphael datafusionДатаФусион false false false false #NULL NULL NULL NULL NULL NULL -# TODO (https://github.com/apache/datafusion/issues/12637) uncomment additional test projections -query TTBB +query TT SELECT ascii_1, unicode_1, ascii_1 LIKE '%' AS ascii_1_like_percent, - unicode_1 LIKE '%' AS unicode_1_like_percent - -- ascii_1 LIKE '%%' AS ascii_1_like_percent_percent, -- TODO enable after fixing https://github.com/apache/datafusion/issues/12637 - -- unicode_1 LIKE '%%' AS unicode_1_like_percent_percent -- TODO enable after fixing https://github.com/apache/datafusion/issues/12637 + unicode_1 LIKE '%' AS unicode_1_like_percent, + ascii_1 LIKE '%%' AS ascii_1_like_percent_percent, + unicode_1 LIKE '%%' AS unicode_1_like_percent_percent FROM test_basic_operator -Andrew datafusion📊🔥 true true -Xiangpeng datafusion数据融合 true true -Raphael datafusionДатаФусион true true -under_score un iść core true true -percent pan Tadeusz ma iść w kąt true true -(empty) (empty) true true -NULL NULL NULL NULL -NULL NULL NULL NULL +Andrew datafusion📊🔥 true true true true +Xiangpeng datafusion数据融合 true true true true +Raphael datafusionДатаФусион true true true true +under_score un iść core true true true true +percent pan Tadeusz ma iść w kąt true true true true +(empty) (empty) true true true true +NULL NULL NULL NULL NULL NULL +NULL NULL NULL NULL NULL NULL -# TODO (https://github.com/apache/datafusion/issues/12637) uncomment additional test projections -query TTBB +query error DataFusion error: SQL error: ParserError\("Expected an expression, found: FROM"\) SELECT ascii_1, unicode_1, ascii_1 NOT LIKE '%' AS ascii_1_not_like_percent, - unicode_1 NOT LIKE '%' AS unicode_1_not_like_percent - -- ascii_1 NOT LIKE '%%' AS ascii_1_not_like_percent_percent, -- TODO enable after fixing https://github.com/apache/datafusion/issues/12637 - -- unicode_1 NOT LIKE '%%' AS unicode_1_not_like_percent_percent -- TODO enable after fixing https://github.com/apache/datafusion/issues/12637 + unicode_1 NOT LIKE '%' AS unicode_1_not_like_percent, + ascii_1 NOT LIKE '%%' AS ascii_1_not_like_percent_percent, + unicode_1 NOT LIKE '%%' AS unicode_1_not_like_percent_percent, Review Comment: ```suggestion unicode_1 NOT LIKE '%%' AS unicode_1_not_like_percent_percent ``` A redundant comma causes the parsing error. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on code in PR #13260: URL: https://github.com/apache/datafusion/pull/13260#discussion_r1832420550 ## datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs: ## @@ -3633,32 +3697,123 @@ mod tests { #[test] fn test_like_and_ilke() { -// LIKE '%' -let expr = like(col("c1"), "%"); +let null = lit(ScalarValue::Utf8(None)); + +// expr [NOT] [I]LIKE NULL +let expr = like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +// expr [NOT] [I]LIKE '%' +let expr = like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +let expr = ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +// expr [NOT] [I]LIKE '%%' Review Comment: will temporarily rebase on top of https://github.com/apache/datafusion/pull/13288 for more test coverage -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2461793835 FYI, when testing LIKE patterns with potential escapes, beware of difference between sqllogictests and CLI when it comes to backslash (like implicit escape) -- https://github.com/apache/datafusion/issues/13286 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on code in PR #13260: URL: https://github.com/apache/datafusion/pull/13260#discussion_r1832332042 ## datafusion/sqllogictest/test_files/string/string_query.slt.part: ## @@ -873,41 +873,30 @@ NULL NULL NULL NULL NULL #Raphael datafusionДатаФусион false false false false #NULL NULL NULL NULL NULL NULL -# TODO (https://github.com/apache/datafusion/issues/12637) uncomment additional test projections -query TTBB +query TT SELECT ascii_1, unicode_1, ascii_1 LIKE '%' AS ascii_1_like_percent, - unicode_1 LIKE '%' AS unicode_1_like_percent - -- ascii_1 LIKE '%%' AS ascii_1_like_percent_percent, -- TODO enable after fixing https://github.com/apache/datafusion/issues/12637 - -- unicode_1 LIKE '%%' AS unicode_1_like_percent_percent -- TODO enable after fixing https://github.com/apache/datafusion/issues/12637 + unicode_1 LIKE '%' AS unicode_1_like_percent, + ascii_1 LIKE '%%' AS ascii_1_like_percent_percent, + unicode_1 LIKE '%%' AS unicode_1_like_percent_percent FROM test_basic_operator -Andrew datafusion📊🔥 true true -Xiangpeng datafusion数据融合 true true -Raphael datafusionДатаФусион true true -under_score un iść core true true -percent pan Tadeusz ma iść w kąt true true -(empty) (empty) true true -NULL NULL NULL NULL -NULL NULL NULL NULL +Andrew datafusion📊🔥 true true true true +Xiangpeng datafusion数据融合 true true true true +Raphael datafusionДатаФусион true true true true +under_score un iść core true true true true +percent pan Tadeusz ma iść w kąt true true true true +(empty) (empty) true true true true +NULL NULL NULL NULL NULL NULL +NULL NULL NULL NULL NULL NULL -# TODO (https://github.com/apache/datafusion/issues/12637) uncomment additional test projections -query TTBB +query error DataFusion error: SQL error: ParserError\("Expected an expression, found: FROM"\) Review Comment: bug in the test query -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on code in PR #13260: URL: https://github.com/apache/datafusion/pull/13260#discussion_r1832239184 ## datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs: ## @@ -3633,32 +3697,123 @@ mod tests { #[test] fn test_like_and_ilke() { -// LIKE '%' -let expr = like(col("c1"), "%"); +let null = lit(ScalarValue::Utf8(None)); + +// expr [NOT] [I]LIKE NULL +let expr = like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +// expr [NOT] [I]LIKE '%' +let expr = like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +let expr = ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +// expr [NOT] [I]LIKE '%%' Review Comment: The `%%` simplification is wrong. `\` is implicit escape. Let me update. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
findepi commented on code in PR #13260: URL: https://github.com/apache/datafusion/pull/13260#discussion_r1831797615 ## datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs: ## @@ -3633,32 +3697,123 @@ mod tests { #[test] fn test_like_and_ilke() { -// LIKE '%' -let expr = like(col("c1"), "%"); +let null = lit(ScalarValue::Utf8(None)); + +// expr [NOT] [I]LIKE NULL +let expr = like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +// expr [NOT] [I]LIKE '%' +let expr = like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +let expr = ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +// expr [NOT] [I]LIKE '%%' Review Comment: > In Postgres, `%%` doesn't mean matching the actual literal '%'. The escape character is `\`. thanks @goldmedal for checking this. in our case, we don't have default escape character at all, i believe. at least we don't define it on Like object. i need to check whether we have an _implicit_ default escape character applied somewhere later. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
alamb commented on code in PR #13260: URL: https://github.com/apache/datafusion/pull/13260#discussion_r1831473963 ## datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs: ## @@ -1470,34 +1471,67 @@ impl<'a, S: SimplifyInfo> TreeNodeRewriter for Simplifier<'a, S> { }) => Transformed::yes(simplify_regex_expr(left, op, right)?), // Rules for Like -Expr::Like(Like { -expr, -pattern, -negated, -escape_char: _, -case_insensitive: _, -}) if matches!( -pattern.as_ref(), -Expr::Literal(ScalarValue::Utf8(Some(pattern_str))) if pattern_str == "%" -) || matches!( -pattern.as_ref(), -Expr::Literal(ScalarValue::LargeUtf8(Some(pattern_str))) if pattern_str == "%" -) || matches!( -pattern.as_ref(), -Expr::Literal(ScalarValue::Utf8View(Some(pattern_str))) if pattern_str == "%" -) => -{ -// exp LIKE '%' is -// - when exp is not NULL, it's true -// - when exp is NULL, it's NULL -// exp NOT LIKE '%' is -// - when exp is not NULL, it's false -// - when exp is NULL, it's NULL -Transformed::yes(Expr::Case(Case { -expr: Some(Box::new(Expr::IsNotNull(expr))), -when_then_expr: vec![(Box::new(lit(true)), Box::new(lit(!negated)))], -else_expr: None, -})) +Expr::Like(like) => { +match as_string_scalar(&like.pattern) { +Some((data_type, pattern_str)) => { +match pattern_str { +None => return Ok(Transformed::yes(lit_bool_null())), +Some(pattern_str) if pattern_str == "%" => { +// exp LIKE '%' is +// - when exp is not NULL, it's true +// - when exp is NULL, it's NULL +// exp NOT LIKE '%' is +// - when exp is not NULL, it's false +// - when exp is NULL, it's NULL +let result_for_non_null = lit(!like.negated); +Transformed::yes(if !info.nullable(&like.expr)? { +result_for_non_null +} else { +Expr::Case(Case { +expr: Some(Box::new(Expr::IsNotNull(like.expr))), +when_then_expr: vec![( +Box::new(lit(true)), +Box::new(result_for_non_null), +)], +else_expr: None, +}) +}) +} +Some(pattern_str) Review Comment: ```suggestion Some(pattern_str) // Repeated occurrences of wildcard are redundant so remove them // exp LIKE '%%' --> exp LIKE '%' ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
alamb commented on code in PR #13260: URL: https://github.com/apache/datafusion/pull/13260#discussion_r1831471559 ## datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs: ## @@ -3633,32 +3697,123 @@ mod tests { #[test] fn test_like_and_ilke() { -// LIKE '%' -let expr = like(col("c1"), "%"); +let null = lit(ScalarValue::Utf8(None)); + +// expr [NOT] [I]LIKE NULL +let expr = like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +// expr [NOT] [I]LIKE '%' +let expr = like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +let expr = ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +// expr [NOT] [I]LIKE '%%' Review Comment: I was definitely confused -- thank you -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
crepererum commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2460111460 since there are already two reviewers on it, I'll gonna skip this PR. However if you need my input, ping me. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
goldmedal commented on code in PR #13260: URL: https://github.com/apache/datafusion/pull/13260#discussion_r1830993512 ## datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs: ## @@ -3633,32 +3697,123 @@ mod tests { #[test] fn test_like_and_ilke() { -// LIKE '%' -let expr = like(col("c1"), "%"); +let null = lit(ScalarValue::Utf8(None)); + +// expr [NOT] [I]LIKE NULL +let expr = like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +// expr [NOT] [I]LIKE '%' +let expr = like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +let expr = ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +// expr [NOT] [I]LIKE '%%' Review Comment: > I thought `expr LIKE '%%'` means the same as `expr = '%'` (match the actual literal character `%` ) > > Specifically that `%%` is how to escape the wildcard to not be a wildcard I think it depends on how we define it. I assume we prefer to follow the Postgres behavior. 🤔 In Postgres, `%%` doesn't mean matching the actual literal '%'. The escape character is `\`. ``` test=# select '1' like '%%'; ?column? -- t (1 row) test=# select '%' like '%%'; ?column? -- t (1 row) test=# select '%' like '\%'; ?column? -- t (1 row) test=# select '1' like '\%'; ?column? -- f (1 row) ``` I think it is just a redundant wildcard. So, we can simplify it to be only one `%`. By the way, in DuckDB, there is a similar behavior for double '%%'. ``` D select '1' like '%%'; ┌───┐ │ ('1' ~~ '%%') │ │boolean│ ├───┤ │ true │ └───┘ D select '%' like '%%'; ┌───┐ │ ('%' ~~ '%%') │ │boolean│ ├───┤ │ true │ └───┘ However, the escape character should be set by `ESCAPE` syntax ``` D select '%' like '\%'; ┌───┐ │ ('%' ~~ '\%') │ │boolean│ ├───┤ │ false │ └───┘ D select '%' like '\%' escape '\'; ┌──┐ │ main.like_escape('%', '\%', '\') │ │ boolean │ ├──┤ │ true │ └──┘ ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
goldmedal commented on code in PR #13260: URL: https://github.com/apache/datafusion/pull/13260#discussion_r1830993512 ## datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs: ## @@ -3633,32 +3697,123 @@ mod tests { #[test] fn test_like_and_ilke() { -// LIKE '%' -let expr = like(col("c1"), "%"); +let null = lit(ScalarValue::Utf8(None)); + +// expr [NOT] [I]LIKE NULL +let expr = like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +// expr [NOT] [I]LIKE '%' +let expr = like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +let expr = ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +// expr [NOT] [I]LIKE '%%' Review Comment: > I thought `expr LIKE '%%'` means the same as `expr = '%'` (match the actual literal character `%` ) > > Specifically that `%%` is how to escape the wildcard to not be a wildcard I think it depends on how we define it. I assume we prefer to follow the Postgres behavior. 🤔 In Postgres, `%%` doesn't mean matching the actual literal '%'. In Postgres, the escape character is `\`. ``` test=# select '1' like '%%'; ?column? -- t (1 row) test=# select '%' like '%%'; ?column? -- t (1 row) test=# select '%' like '\%'; ?column? -- t (1 row) test=# select '1' like '\%'; ?column? -- f (1 row) ``` I think it is just a redundant wildcard. So, we can simplify it to be only one `%`. By the way, in DuckDB, there is a similar behavior for double '%%'. ``` D select '1' like '%%'; ┌───┐ │ ('1' ~~ '%%') │ │boolean│ ├───┤ │ true │ └───┘ D select '%' like '%%'; ┌───┐ │ ('%' ~~ '%%') │ │boolean│ ├───┤ │ true │ └───┘ However, the escape character should be set by `ESCAPE` syntax ``` D select '%' like '\%'; ┌───┐ │ ('%' ~~ '\%') │ │boolean│ ├───┤ │ false │ └───┘ D select '%' like '\%' escape '\'; ┌──┐ │ main.like_escape('%', '\%', '\') │ │ boolean │ ├──┤ │ true │ └──┘ ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
goldmedal commented on code in PR #13260: URL: https://github.com/apache/datafusion/pull/13260#discussion_r1830993512 ## datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs: ## @@ -3633,32 +3697,123 @@ mod tests { #[test] fn test_like_and_ilke() { -// LIKE '%' -let expr = like(col("c1"), "%"); +let null = lit(ScalarValue::Utf8(None)); + +// expr [NOT] [I]LIKE NULL +let expr = like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_like(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +let expr = not_ilike(col("c1"), null.clone()); +assert_eq!(simplify(expr), lit_bool_null()); + +// expr [NOT] [I]LIKE '%' +let expr = like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_like(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +let expr = ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), true)); + +let expr = not_ilike(col("c1"), lit("%")); +assert_eq!(simplify(expr), if_not_null(col("c1"), false)); + +// expr [NOT] [I]LIKE '%%' Review Comment: > I thought `expr LIKE '%%'` means the same as `expr = '%'` (match the actual literal character `%` ) > > Specifically that `%%` is how to escape the wildcard to not be a wildcard I think it depends on how we define it. I assume we prefer to follow the Postgres behavior. 🤔 In Postgres, `%%` doesn't mean matching the actual literal '%'. The escape character is `\`. ``` test=# select '1' like '%%'; ?column? -- t (1 row) test=# select '%' like '%%'; ?column? -- t (1 row) test=# select '%' like '\%'; ?column? -- t (1 row) test=# select '1' like '\%'; ?column? -- f (1 row) ``` I think `%%` is just a redundant wildcard. So, we can simplify it to be only one `%`. By the way, in DuckDB, there is a similar behavior for double '%%'. ``` D select '1' like '%%'; ┌───┐ │ ('1' ~~ '%%') │ │boolean│ ├───┤ │ true │ └───┘ D select '%' like '%%'; ┌───┐ │ ('%' ~~ '%%') │ │boolean│ ├───┤ │ true │ └───┘ However, the escape character should be set by `ESCAPE` syntax ``` D select '%' like '\%'; ┌───┐ │ ('%' ~~ '\%') │ │boolean│ ├───┤ │ false │ └───┘ D select '%' like '\%' escape '\'; ┌──┐ │ main.like_escape('%', '\%', '\') │ │ boolean │ ├──┤ │ true │ └──┘ ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org
Re: [PR] Expand LIKE simplification: cover `NULL` pattern/expression and constant [datafusion]
goldmedal commented on code in PR #13260: URL: https://github.com/apache/datafusion/pull/13260#discussion_r1831061785 ## datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs: ## @@ -3633,32 +3697,123 @@ mod tests { #[test] fn test_like_and_ilke() { -// LIKE '%' -let expr = like(col("c1"), "%"); +let null = lit(ScalarValue::Utf8(None)); + +// expr [NOT] [I]LIKE NULL +let expr = like(col("c1"), null.clone()); Review Comment: ```suggestion let expr = col("c1").like(null.clone()); ``` As @alamb's comment (https://github.com/apache/datafusion/pull/13260#discussion_r1830873028), we can do some refactor like this. ## datafusion/sqllogictest/test_files/string/string_view.slt: ## @@ -396,8 +396,9 @@ EXPLAIN SELECT FROM test; logical_plan -01)Projection: test.column1_utf8view LIKE Utf8View("foo") AS like, test.column1_utf8view ILIKE Utf8View("foo") AS ilike -02)--TableScan: test projection=[column1_utf8view] +01)Projection: __common_expr_1 AS like, __common_expr_1 AS ilike Review Comment: Agreed. I did some tests. I think they still use the native StringView implementation for pattern matching. ``` query TT EXPLAIN SELECT column1_utf8view like '%foo%' as "like", column1_utf8view ilike '%foo%' as "ilike" FROM test; logical_plan 01)Projection: test.column1_utf8view LIKE Utf8View("%foo%") AS like, test.column1_utf8view ILIKE Utf8View("%foo%") AS ilike 02)--TableScan: test projection=[column1_utf8view] ``` -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org For additional commands, e-mail: github-h...@datafusion.apache.org