Mihaly Szjatinya has posted comments on this change. ( http://gerrit.cloudera.org:8080/21825 )
Change subject: IMPALA-889: Add trim() function matching ANSI SQL definition ...................................................................... Patch Set 8: (6 comments) Added keywords for trim options. Restructured and added some new tests. http://gerrit.cloudera.org:8080/#/c/21825/6//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/21825/6//COMMIT_MSG@55 PS6, Line 55: However, despite : that logic in the code, having "TRIM" just as a keyword would still : result in syntax error (e.g. in queries like: "create table t1(tr > With the current state of the patch "create table t1(trim int)" would succe Done http://gerrit.cloudera.org:8080/#/c/21825/6/fe/src/main/cup/sql-parser.cup File fe/src/main/cup/sql-parser.cup: http://gerrit.cloudera.org:8080/#/c/21825/6/fe/src/main/cup/sql-parser.cup@3870 PS6, Line 3870: /DATE ts)" > I don't get it, how does this affect sum(count(id))? All the new causes con I have caught a test breaking on this without string_expr, but cannot reproduce now; removing the comment. string_expr still makes sense. http://gerrit.cloudera.org:8080/#/c/21825/6/fe/src/main/jflex/sql-scanner.flex File fe/src/main/jflex/sql-scanner.flex: http://gerrit.cloudera.org:8080/#/c/21825/6/fe/src/main/jflex/sql-scanner.flex@421 PS6, Line 421: "check", "classifier", "close > I still think that it would be better to keep these as reserved and add the Discussed offline, added keywords. http://gerrit.cloudera.org:8080/#/c/21825/6/testdata/workloads/functional-query/queries/QueryTest/utf8-string-functions.test File testdata/workloads/functional-query/queries/QueryTest/utf8-string-functions.test: http://gerrit.cloudera.org:8080/#/c/21825/6/testdata/workloads/functional-query/queries/QueryTest/utf8-string-functions.test@398 PS6, Line 398: ---- QUERY > I would prefer to reorganize the tests a bit: >I would prefer to reorganize the tests a bit: >For queries that are semantically equivalent to existing trim/rtrim/ltrim >tests in this file, I would put the new test next to >the original, e.g. >after >select id, name, rtrim(name, '?????') from utf8_str_tiny; >I would put >select id, name, trim(trailing name, '?????') from utf8_str_tiny; Done >For other tests I don't think that the utf8 handling is important, so for >example >select id, name, trim(left('??????‘’“”????????',5) from name) from >utf8_str_tiny; >doesn't seem to trim anything from the dataset. A simple test like >select id, name, trim(left('??????‘’“”????????',5) from "!a!"; >seems more useful. >Also, tests where utf8 handling is not relevant could be moved to other files, >for example expr-test.cc or exprs.test. >Currently most trim tests are in >expr-test.cc I moved tests that don't require table and are not utf8 related to expr-test.cc. There's only a few left and, since I haven't found e2e tests for regular trim, I'd suggest to keep those here, not to have them overly scattered. http://gerrit.cloudera.org:8080/#/c/21825/6/testdata/workloads/functional-query/queries/QueryTest/utf8-string-functions.test@400 PS6, Line 400: leading > I don't see any test for `both` - can you add one test with charset, and on Added to expr-test.cc http://gerrit.cloudera.org:8080/#/c/21825/6/testdata/workloads/functional-query/queries/QueryTest/utf8-string-functions.test@700 PS6, Line 700: > The current tests always use utf8_mode=true, while this test uses utf8_trim Yes, this test is specifically for this syntax. -- To view, visit http://gerrit.cloudera.org:8080/21825 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I3c4fa6d0d8d0684c4b6d8dac8fd531d205e4f7b4 Gerrit-Change-Number: 21825 Gerrit-PatchSet: 8 Gerrit-Owner: Mihaly Szjatinya <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Mihaly Szjatinya <[email protected]> Gerrit-Reviewer: Noemi Pap-Takacs <[email protected]> Gerrit-Reviewer: Peter Rozsa <[email protected]> Gerrit-Comment-Date: Wed, 23 Oct 2024 21:34:58 +0000 Gerrit-HasComments: Yes
