Re: [PR] Expand LIKE simplification [datafusion]
alamb commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1830873028
##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -2987,41 +3051,41 @@ mod tests {
})
}
-fn like(expr: Expr, pattern: &str) -> Expr {
+fn like(expr: Expr, pattern: impl Into) -> Expr {
Review Comment:
I think we could use Expr::like
https://docs.rs/datafusion/latest/datafusion/logical_expr/enum.Expr.html#method.like
and similar here instead of these functions
This is likely left over from when the Expr API was less expressive
##
datafusion/sqllogictest/test_files/string/string_view.slt:
##
@@ -396,8 +396,9 @@ EXPLAIN SELECT
FROM test;
logical_plan
-01)Projection: test.column1_utf8view LIKE Utf8View("foo") AS like,
test.column1_utf8view ILIKE Utf8View("foo") AS ilike
-02)--TableScan: test projection=[column1_utf8view]
+01)Projection: __common_expr_1 AS like, __common_expr_1 AS ilike
Review Comment:
Nice!
Though I think this test's meaning is now changed it is supposed to be
verifying cast exprs for like
Perhaps you can change the patterns to `like '%foo%'`
##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -1470,34 +1471,67 @@ impl<'a, S: SimplifyInfo> TreeNodeRewriter for
Simplifier<'a, S> {
}) => Transformed::yes(simplify_regex_expr(left, op, right)?),
// Rules for Like
-Expr::Like(Like {
-expr,
-pattern,
-negated,
-escape_char: _,
-case_insensitive: _,
-}) if matches!(
-pattern.as_ref(),
-Expr::Literal(ScalarValue::Utf8(Some(pattern_str))) if
pattern_str == "%"
-) || matches!(
-pattern.as_ref(),
-Expr::Literal(ScalarValue::LargeUtf8(Some(pattern_str))) if
pattern_str == "%"
-) || matches!(
-pattern.as_ref(),
-Expr::Literal(ScalarValue::Utf8View(Some(pattern_str))) if
pattern_str == "%"
-) =>
-{
-// exp LIKE '%' is
-// - when exp is not NULL, it's true
-// - when exp is NULL, it's NULL
-// exp NOT LIKE '%' is
-// - when exp is not NULL, it's false
-// - when exp is NULL, it's NULL
-Transformed::yes(Expr::Case(Case {
-expr: Some(Box::new(Expr::IsNotNull(expr))),
-when_then_expr: vec![(Box::new(lit(true)),
Box::new(lit(!negated)))],
-else_expr: None,
-}))
+Expr::Like(like) => {
+match as_string_scalar(&like.pattern) {
+Some((data_type, pattern_str)) => {
+match pattern_str {
+None => return
Ok(Transformed::yes(lit_bool_null())),
+Some(pattern_str) if pattern_str == "%" => {
+// exp LIKE '%' is
+// - when exp is not NULL, it's true
+// - when exp is NULL, it's NULL
+// exp NOT LIKE '%' is
+// - when exp is not NULL, it's false
+// - when exp is NULL, it's NULL
+let result_for_non_null = lit(!like.negated);
+Transformed::yes(if
!info.nullable(&like.expr)? {
+result_for_non_null
+} else {
+Expr::Case(Case {
+expr:
Some(Box::new(Expr::IsNotNull(like.expr))),
+when_then_expr: vec![(
+Box::new(lit(true)),
+Box::new(result_for_non_null),
+)],
+else_expr: None,
+})
+})
+}
+Some(pattern_str)
Review Comment:
I am not sure about this rule (see questions below) -- maybe you could add
some comments explanation on what it is supposed to do.
##
datafusion/optimizer/src/simplify_expressions/expr_simplifier.rs:
##
@@ -3633,32 +3697,123 @@ mod tests {
#[test]
fn test_like_and_ilke() {
-// LIKE '%'
-let expr = like(col("c1"), "%");
+let null = lit(ScalarValue::Utf8(None));
+
+// expr [NOT] [I]LIKE NULL
+let expr = like(col("c1"), null.clone());
+assert_eq!(simplify(expr), lit_bool_null());
+
+let expr = not_like(col("c1"), null
Re: [PR] Expand LIKE simplification [datafusion]
findepi commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2458895362 @goldmedal rebased, thanks! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification [datafusion]
goldmedal commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2458624344 > draft - to be rebased after #13259 lands > > still ready to review cc @crepererum @goldmedal #13259 has been merged. š -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification [datafusion]
findepi commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1829718515
##
datafusion/optimizer/Cargo.toml:
##
@@ -47,6 +47,7 @@ indexmap = { workspace = true }
itertools = { workspace = true }
log = { workspace = true }
paste = "1.0.14"
+regex = "1.11.0"
Review Comment:
good catch, done
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification [datafusion]
goldmedal commented on code in PR #13260:
URL: https://github.com/apache/datafusion/pull/13260#discussion_r1829563391
##
datafusion/optimizer/Cargo.toml:
##
@@ -47,6 +47,7 @@ indexmap = { workspace = true }
itertools = { workspace = true }
log = { workspace = true }
paste = "1.0.14"
+regex = "1.11.0"
Review Comment:
```suggestion
regex = { workspace = true }
```
Although the version is different, I think it works well.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification [datafusion]
findepi commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2457460720 draft - to be rebased after https://github.com/apache/datafusion/pull/13259 lands still ready to review cc @crepererum @goldmedal -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification [datafusion]
adriangb commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2457442850 Iām happy with my changes being included in this PR :) -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification [datafusion]
findepi commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2457435093 seemed easy enough, done. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification [datafusion]
findepi commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2457414646 CI is green, so ready to review but this will conflict with @adriangb's https://github.com/apache/datafusion/pull/13061, because this changes code structure a little bit conflict resolution will be trivial though. @adriangb if you want, i can do this, adding your changes to this PR -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
Re: [PR] Expand LIKE simplification [datafusion]
findepi commented on PR #13260: URL: https://github.com/apache/datafusion/pull/13260#issuecomment-2457210525 currently depends on https://github.com/apache/datafusion/pull/13259 -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] - To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
