Tim Armstrong has posted comments on this change. ( http://gerrit.cloudera.org:8080/11814 )
Change subject: IMPALA-7586: fix predicate pushdown of escaped strings ...................................................................... Patch Set 5: (6 comments) http://gerrit.cloudera.org:8080/#/c/11814/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/11814/5//COMMIT_MSG@9 PS5, Line 9: This fixes a class of bugs where the planner incorrectly uses the raw : string from the parser instead of the unescaped string. > fyi, I have found a similar issue in partition pruning, see IMPALA-7784. ack http://gerrit.cloudera.org:8080/#/c/11814/5//COMMIT_MSG@26 PS5, Line 26: Added regression test that tests handling of backslash escapes on all file : formats. I did not add a regression test for the data source bug since it > nit: long lines, please wrap at 72 Done http://gerrit.cloudera.org:8080/#/c/11814/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java File fe/src/main/java/org/apache/impala/planner/KuduScanNode.java: http://gerrit.cloudera.org:8080/#/c/11814/5/fe/src/main/java/org/apache/impala/planner/KuduScanNode.java@319 PS5, Line 319: getExplainString > It looks like this always prints the normalized/unescaped string even thoug The behaviour here seems ok to me - it calls toSql(), which consistently escapes things. If we didn't escape things, then the expressions in the explain output might not be valid expressions in our SQL expressions, which seems inconvenient. http://gerrit.cloudera.org:8080/#/c/11814/5/testdata/data/strings_with_quotes.csv File testdata/data/strings_with_quotes.csv: http://gerrit.cloudera.org:8080/#/c/11814/5/testdata/data/strings_with_quotes.csv@11 PS5, Line 11: > add foo\"bar, 11 too? That shouldn't be returned with ..where s = "foo\"bar Done. I assume you meant that you wanted the value in the table in include the backslash, which requires a double backslash here. http://gerrit.cloudera.org:8080/#/c/11814/5/testdata/workloads/functional-query/queries/QueryTest/string-escaping-rcfile-bug.test File testdata/workloads/functional-query/queries/QueryTest/string-escaping-rcfile-bug.test: http://gerrit.cloudera.org:8080/#/c/11814/5/testdata/workloads/functional-query/queries/QueryTest/string-escaping-rcfile-bug.test@1 PS5, Line 1: ==== > Can you add an extra comment to make it even clearer that this file contain Done http://gerrit.cloudera.org:8080/#/c/11814/5/testdata/workloads/functional-query/queries/QueryTest/string-escaping-rcfile-bug.test@4 PS5, Line 4: # IMPALA-7778: escapes are ignored so output is incorrect > Can't we just xfail if format == 'rc' instead of doing this? or is it becau I think it's better to exercise and track the known-bad behaviour than leave it unexercised. Also, honestly, I don't know if there is enough usage of RC at this point that anyone will be motivated to fix it. -- To view, visit http://gerrit.cloudera.org:8080/11814 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I53d6e20dd48ab6837ddd325db8a9d49ee04fed28 Gerrit-Change-Number: 11814 Gerrit-PatchSet: 5 Gerrit-Owner: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Reviewer: Bharath Vissapragada <bhara...@cloudera.com> Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com> Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com> Gerrit-Reviewer: Tim Armstrong <tarmstr...@cloudera.com> Gerrit-Comment-Date: Tue, 30 Oct 2018 19:28:43 +0000 Gerrit-HasComments: Yes