Jim Apple has posted comments on this change. Change subject: IMPALA-889: Add support for an ISO-SQL compliant trim() function. ......................................................................
Patch Set 6: (32 comments) http://gerrit.cloudera.org:8080/#/c/4474/1//COMMIT_MSG Commit Message: Line 10: specified direction(s) of a STRING value. > Greetings, Jim. Then change the name of the new function away from btrim to something else. http://gerrit.cloudera.org:8080/#/c/4474/6//COMMIT_MSG Commit Message: PS6, Line 14: Syntax #3 confirms the standard SQL syntax (Core SQL feature ID All of them are part of the standard, as is simply TRIM(string_to_be_trimmed). http://savage.net.au/SQL/sql-92.bnf.html PS6, Line 18: | just use regular prose here: Valid options are "leading", "trailing", and "both" PS6, Line 22: empty argument What does "empty argument" mean in this case? PS6, Line 22: NULL How is a NULL passed? Is it literally just "trim(NULL from 'foo')"? PS6, Line 23: returns not just "returns", "TRIM returns". Here and below. PS6, Line 25: "trim_char_set": Case-sensitive characters to be removed. This Please separate paragraphs by a blank line. PS6, Line 29: " " Is the standard space or all whitespace, including tabs, newlines, etc.? PS6, Line 32: Blank How is a "blank" argument different than an "empty argument" above? PS6, Line 33: ote: For Syntax #1, since no "characters" is specified, it trims : " "(space) by default. For Syntax #2, since no "where" is specified, : the option both is implied by default. Please split this up and put it above in the text on each option. PS6, Line 44: @&@&@&@ This is hard to read. Can you stick to alphanumeric characters in these examples, please? The smaller the input that illustrates the point, the clearer it is. This applies to tests, as well. http://gerrit.cloudera.org:8080/#/c/4474/6/be/src/exprs/expr-test.cc File be/src/exprs/expr-test.cc: Line 1977: TestStringValue("trim(leading FROM ' &@&127+ &@ ')", "&@&127+ &@ "); Please try to have each test case test one thing different from every other test case. What a case is testing should be clear by looking at it. The smaller the test case is, the better. As an example, would this test case be better as "(trim leading FROM ' a b ')"? http://gerrit.cloudera.org:8080/#/c/4474/6/be/src/exprs/string-functions-ir.cc File be/src/exprs/string-functions-ir.cc: Line 788: static inline bitset<256>* TrimInternalPrepare(FunctionContext* ctx, Please use unnamed namespaces; see https://google.github.io/styleguide/cppguide.html PS6, Line 805: bitset const PS6, Line 806: & Out parameters, when used, are not references. https://google.github.io/styleguide/cppguide.html#Reference_Arguments PS6, Line 806: begin prefer to return, not pass out parameters. PS6, Line 807: test Please use [], not test - we don't need the bounds checking. PS6, Line 807: static_cast<int> What type is str.ptr[begin]? What type does test take? Do you need this cast? Why or why not? PS6, Line 813: int & begin is not modified. PS6, Line 822: bitset const PS6, Line 824: int32_t Don't use int32_t and int interchangeably without a specific reason PS6, Line 835: str The original functions made a new string val, rather than returning this one. Do you know why? Line 848: option.ptr[i] = std::tolower(option.ptr[i]); Though awkward, please insert the required casts here. PS6, Line 851: TrimOption This should be done at parsing/analysis time, for efficiency reasons PS6, Line 855: INVALID LEADING = 1, TRAILING = 2, BOTH = LEADING | TRAILING, INVALID = ~BOTH. The comparison on line 886 gets simplified. PS6, Line 858: opt_ https://google.github.io/styleguide/cppguide.html#Variable_Names PS6, Line 861: ( Please use clang-format to format your code. PS6, Line 861: 7 try to avoid repeating literals, like 7. Maybe set static constexpr char LEADING[] = "leading", then check if option.len == sizeof(LEADING)-1? See how that looks. PS6, Line 863: both bad copy PS6, Line 866: both bad copy http://gerrit.cloudera.org:8080/#/c/4474/1/be/src/exprs/string-functions.h File be/src/exprs/string-functions.h: PS1, Line 123: Base6 And while you're changing the name, no need to call this BTrim. http://gerrit.cloudera.org:8080/#/c/4474/1/common/function-registry/impala_functions.py File common/function-registry/impala_functions.py: PS1, Line 478: ezza This is the string you should change. Change it to something else, like sql92trim, then make that invisible. -- To view, visit http://gerrit.cloudera.org:8080/4474 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I4753c608b0b00569bf8c5e95b132df6df358e602 Gerrit-PatchSet: 6 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Youwei Wang <youwei.a.w...@intel.com> Gerrit-Reviewer: Alex Behm <alex.b...@cloudera.com> Gerrit-Reviewer: Jim Apple <jbap...@cloudera.com> Gerrit-Reviewer: Mostafa Mokhtar <mmokh...@cloudera.com> Gerrit-Reviewer: Youwei Wang <youwei.a.w...@intel.com> Gerrit-HasComments: Yes