[Impala-ASF-CR] IMPALA-2422: Fix escaping in the LIKE clause
Tim Armstrong has abandoned this change. ( http://gerrit.cloudera.org:8080/10910 ) Change subject: IMPALA-2422: Fix escaping in the LIKE clause .. Abandoned -- To view, visit http://gerrit.cloudera.org:8080/10910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: abandon Gerrit-Change-Id: I553412318525820a36d2f401aa7e93958d22f70e Gerrit-Change-Number: 10910 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-2422: Fix escaping in the LIKE clause
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10910 ) Change subject: IMPALA-2422: Fix escaping in the LIKE clause .. Patch Set 2: Code-Review-1 This is stale but I'm -1ing because of the issue I previously mentioned. -- To view, visit http://gerrit.cloudera.org:8080/10910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I553412318525820a36d2f401aa7e93958d22f70e Gerrit-Change-Number: 10910 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Mon, 13 Aug 2018 16:59:05 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2422: Fix escaping in the LIKE clause
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10910 ) Change subject: IMPALA-2422: Fix escaping in the LIKE clause .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/10910/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java: http://gerrit.cloudera.org:8080/#/c/10910/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java@a304 PS1, Line 304: > Not quite sure what you mean. The test is the same as before, except we rem my question was about backwards compatibility. lets update the jira description to emphasize that the bug fix, while good that it tries to fix the issue, will change existing queries. -- To view, visit http://gerrit.cloudera.org:8080/10910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I553412318525820a36d2f401aa7e93958d22f70e Gerrit-Change-Number: 10910 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 12 Jul 2018 22:21:52 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2422: Fix escaping in the LIKE clause
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10910 ) Change subject: IMPALA-2422: Fix escaping in the LIKE clause .. Patch Set 2: (1 comment) http://gerrit.cloudera.org:8080/#/c/10910/2/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: http://gerrit.cloudera.org:8080/#/c/10910/2/be/src/exprs/expr-test.cc@3481 PS2, Line 3481: // We return a different result if the second argument is not a string literal. Am I understanding this right that LIKE now behaves differently depending on whether its argument is constant or a variable that evaluates to the same value? That's going to cause a lot of issues... -- To view, visit http://gerrit.cloudera.org:8080/10910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I553412318525820a36d2f401aa7e93958d22f70e Gerrit-Change-Number: 10910 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 12 Jul 2018 21:21:41 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2422: Fix escaping in the LIKE clause
Taras Bobrovytsky has uploaded a new patch set (#2). ( http://gerrit.cloudera.org:8080/10910 ) Change subject: IMPALA-2422: Fix escaping in the LIKE clause .. IMPALA-2422: Fix escaping in the LIKE clause There are two stages to processing a like clause. First, we determine if it is possible to convert the expression to a simpler function, such as StartsWith() or EndsWith(). If not, then we use a Regex libarary to compute the result. There was a problem in the logic that determines if it is possible to use a simpler function. It did not take into account escape characters. For example, "abc\%" was incorrectly converted to StartsWith("abc\"). There was another problem. We always unescaped strings in the frontend. The RE2 regex function also unescapes the regex before proceeding. So regexes were unescaped twice, which caused some ambiguity. For example, "abc\%" and "abc\\%" are unescaped in the frontend and the same pattern, "abc\%" is sent to the backend. The backend could not decide if this pattern is an exact or prefix match. To fix this problem, we avoid unescaping regex pattens in the frontend. Testing: -Added expr tests. Change-Id: I553412318525820a36d2f401aa7e93958d22f70e --- M be/src/exprs/expr-test.cc M be/src/exprs/like-predicate.cc M fe/src/main/java/org/apache/impala/analysis/LikePredicate.java M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java 5 files changed, 68 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/10910/2 -- To view, visit http://gerrit.cloudera.org:8080/10910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newpatchset Gerrit-Change-Id: I553412318525820a36d2f401aa7e93958d22f70e Gerrit-Change-Number: 10910 Gerrit-PatchSet: 2 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac
[Impala-ASF-CR] IMPALA-2422: Fix escaping in the LIKE clause
Taras Bobrovytsky has posted comments on this change. ( http://gerrit.cloudera.org:8080/10910 ) Change subject: IMPALA-2422: Fix escaping in the LIKE clause .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/10910/1/be/src/exprs/like-predicate.cc File be/src/exprs/like-predicate.cc: http://gerrit.cloudera.org:8080/#/c/10910/1/be/src/exprs/like-predicate.cc@265 PS1, Line 265:StringValue tmp_val = StringValue::FromStringVal(val); > ws Oops , this was temporary garbage that I forgot to remove (same as below). http://gerrit.cloudera.org:8080/#/c/10910/1/fe/src/main/java/org/apache/impala/analysis/LikePredicate.java File fe/src/main/java/org/apache/impala/analysis/LikePredicate.java: http://gerrit.cloudera.org:8080/#/c/10910/1/fe/src/main/java/org/apache/impala/analysis/LikePredicate.java@132 PS1, Line 132: StringLiteral > basic question: what happens with the interaction between constant folding If the second child becomes a string literal due to constant folding, then the NeedsUnescaping will not be flipped to false here. And I think this is OK. I added a test for this. http://gerrit.cloudera.org:8080/#/c/10910/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java: http://gerrit.cloudera.org:8080/#/c/10910/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java@a304 PS1, Line 304: > basic question: this changes the test. are inputs expected to change? just Not quite sure what you mean. The test is the same as before, except we remove one level of escaping. Before, with four slashes the escaping worked as follows: 1. The first level escaped the Java escaping 2. The second level escaped the FE escaping. So the BE would receive this: "\_" After my change, the backend receives the same thing because the FE escaping is eliminated. -- To view, visit http://gerrit.cloudera.org:8080/10910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I553412318525820a36d2f401aa7e93958d22f70e Gerrit-Change-Number: 10910 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Thu, 12 Jul 2018 02:43:08 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2422: Fix escaping in the LIKE clause
Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/10910 ) Change subject: IMPALA-2422: Fix escaping in the LIKE clause .. Patch Set 1: How does this related to the previous version of this patch? Would be great to get this fixed. -- To view, visit http://gerrit.cloudera.org:8080/10910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I553412318525820a36d2f401aa7e93958d22f70e Gerrit-Change-Number: 10910 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Tim Armstrong Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 11 Jul 2018 17:02:10 + Gerrit-HasComments: No
[Impala-ASF-CR] IMPALA-2422: Fix escaping in the LIKE clause
Vuk Ercegovac has posted comments on this change. ( http://gerrit.cloudera.org:8080/10910 ) Change subject: IMPALA-2422: Fix escaping in the LIKE clause .. Patch Set 1: (3 comments) http://gerrit.cloudera.org:8080/#/c/10910/1/be/src/exprs/like-predicate.cc File be/src/exprs/like-predicate.cc: http://gerrit.cloudera.org:8080/#/c/10910/1/be/src/exprs/like-predicate.cc@265 PS1, Line 265:StringValue tmp_val = StringValue::FromStringVal(val); ws http://gerrit.cloudera.org:8080/#/c/10910/1/fe/src/main/java/org/apache/impala/analysis/LikePredicate.java File fe/src/main/java/org/apache/impala/analysis/LikePredicate.java: http://gerrit.cloudera.org:8080/#/c/10910/1/fe/src/main/java/org/apache/impala/analysis/LikePredicate.java@132 PS1, Line 132: StringLiteral basic question: what happens with the interaction between constant folding and analyzing this predicate? my understanding is that child[1] will get replaced; if its replaced with some string that should be unescaped, is it analyzed at the right time so that this flag is flipped? if there isn't a test for this, lets add one. http://gerrit.cloudera.org:8080/#/c/10910/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java File fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java: http://gerrit.cloudera.org:8080/#/c/10910/1/fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java@a304 PS1, Line 304: basic question: this changes the test. are inputs expected to change? just checking if there is any compatibility issue here. -- To view, visit http://gerrit.cloudera.org:8080/10910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I553412318525820a36d2f401aa7e93958d22f70e Gerrit-Change-Number: 10910 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky Gerrit-Reviewer: Vuk Ercegovac Gerrit-Comment-Date: Wed, 11 Jul 2018 02:19:36 + Gerrit-HasComments: Yes
[Impala-ASF-CR] IMPALA-2422: Fix escaping in the LIKE clause
Taras Bobrovytsky has uploaded this change for review. ( http://gerrit.cloudera.org:8080/10910 Change subject: IMPALA-2422: Fix escaping in the LIKE clause .. IMPALA-2422: Fix escaping in the LIKE clause There are two stages to processing a like clause. First, we determine if it is possible to convert the expression to a simpler function, such as StartsWith() or EndsWith(). If not, then we use a Regex libarary to compute the result. There was a problem in the logic that determines if it is possible to use a simpler function. It did not take into account escape characters. For example, "abc\%" was incorrectly converted to StartsWith("abc\"). There was another problem. We always unescaped strings in the frontend. The RE2 regex function also unescapes the regex before proceeding. So regexes were unescaped twice, which caused some ambiguity. For example, "abc\%" and "abc\\%" are unescaped in the frontend and the same pattern, "abc\%" is sent to the backend. The backend could not decide if this pattern is an exact or prefix match. To fix this problem, we avoid unescaping regex pattens in the frontend. Testing: -Added expr tests. Change-Id: I553412318525820a36d2f401aa7e93958d22f70e --- M be/src/exprs/expr-test.cc M be/src/exprs/like-predicate.cc M fe/src/main/java/org/apache/impala/analysis/LikePredicate.java M fe/src/main/java/org/apache/impala/analysis/StringLiteral.java M fe/src/test/java/org/apache/impala/analysis/ExprRewriteRulesTest.java 5 files changed, 77 insertions(+), 23 deletions(-) git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/10/10910/1 -- To view, visit http://gerrit.cloudera.org:8080/10910 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: newchange Gerrit-Change-Id: I553412318525820a36d2f401aa7e93958d22f70e Gerrit-Change-Number: 10910 Gerrit-PatchSet: 1 Gerrit-Owner: Taras Bobrovytsky