Daniel Becker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/21825 )

Change subject: IMPALA-889: Add trim() function matching ANSI SQL definition
......................................................................


Patch Set 2:

(19 comments)

Thanks for this change, Mihály!

http://gerrit.cloudera.org:8080/#/c/21825/2//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/21825/2//COMMIT_MSG@14
PS2, Line 14: MPALA
Nit: IMPALA-...


http://gerrit.cloudera.org:8080/#/c/21825/2//COMMIT_MSG@16
PS2, Line 16: https://gerrit.sjc.cloudera.com/c/Impala/+/4579
This is an internal link, so it would be better to have a Github link to this 
commit: 
https://github.com/apache/impala/commit/543fa73f3a846f0e4527514c993cb0985912b06c


http://gerrit.cloudera.org:8080/#/c/21825/2//COMMIT_MSG@52
PS2, Line 52: a
Nit: negative


http://gerrit.cloudera.org:8080/#/c/21825/2//COMMIT_MSG@64
PS2, Line 64: 2. Escaping "where". Since "leading", "trailing", and "both" are 
inherited
Nit: line too long.


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/cup/sql-parser.cup@3805
PS2, Line 3805:   unnest_expr:e
Shouldn't "unnest_expr" also incuded in "string_expr"? Similarly to this:
  select trim(t.string_col FROM "0101") from alltypestiny t;
this could also work:
  select trim(unnest(arr2) from "onetwothree") from  complextypes_arrays;


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/cup/sql-parser.cup@3829
PS2, Line 3829:
We could add an EE test for each of these cases as the "charset" parameter of 
TRIM.


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/cup/sql-parser.cup@3873
PS2, Line 3873: compromizing syntax like
Could you elaborate on it, I don't completely understand the example. Maybe a 
complete wrong example would be better, including the TRIM function.


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/cup/sql-parser.cup@3873
PS2, Line 3873: string_expr is needed here
This is also true for the version above, isn't it? Then it would be better to 
put it there and indicate that it refers to both cases.


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java
File fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java:

http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@45
PS2, Line 45: fnName_
This field shadows the field with the same name from the base class. We could 
use another name, e.g. "trimFromFnName_".


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@51
PS2, Line 51:   private static final Set<String> TRIM_OPTIONS;
This could instead be an enum.


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@60
PS2, Line 60: B
Nit: should start with lower-case letter.


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@69
PS2, Line 69:       // Error will be raised on analysis stage. For now set 
btrim as default.
Can't we raise an exception here?


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@77
PS2, Line 77:     this(fnName, where, null, srcexpr);
I'm wondering whether it's a good idea to raise an error when the "where" 
parameter is explicitly given (so this constructor is called and not the one on 
L81) AND it is NULL.
I don't think it's good practice to give NULL as the parameter and rely on it 
defaulting to "both".


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@98
PS2, Line 98:    * Copy c'tor used in clone().
We should also override clone(), otherwise this copy constructor won't be used.


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@111
PS2, Line 111: extract
It should be "trim()", shouldn't it?


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@113
PS2, Line 113: fnName_.getFunction()
This could be extracted into a variable.


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@143
PS2, Line 143:       if (errMsg.length() > 0) { errMsg.append(" "); }
If the block body firs on one line, we omit the braces in Impala.


http://gerrit.cloudera.org:8080/#/c/21825/2/fe/src/main/java/org/apache/impala/analysis/TrimFromExpr.java@144
PS2, Line 144: errMsg.append("Expression '" + srcexpr_.toSql() + "' has a 
return type of ");
             :       errMsg.append(srcexpr_.getType().toSql() + " but a STRING 
is required.");
This could be extracted into a function that could also be used on L139.


http://gerrit.cloudera.org:8080/#/c/21825/2/testdata/workloads/functional-query/queries/QueryTest/utf8-string-functions.test
File 
testdata/workloads/functional-query/queries/QueryTest/utf8-string-functions.test:

http://gerrit.cloudera.org:8080/#/c/21825/2/testdata/workloads/functional-query/queries/QueryTest/utf8-string-functions.test@398
PS2, Line 398: ---- QUERY
We could have some EE tests with utf8_trim() too.



--
To view, visit http://gerrit.cloudera.org:8080/21825
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I3c4fa6d0d8d0684c4b6d8dac8fd531d205e4f7b4
Gerrit-Change-Number: 21825
Gerrit-PatchSet: 2
Gerrit-Owner: Mihaly Szjatinya <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Fri, 20 Sep 2024 13:00:15 +0000
Gerrit-HasComments: Yes

Reply via email to