Zach Amsden has posted comments on this change.

Change subject: IMPALA-4729: Implement REPLACE()
......................................................................


Patch Set 2:

(6 comments)

http://gerrit.cloudera.org:8080/#/c/5776/2/be/src/exprs/string-functions-ir.cc
File be/src/exprs/string-functions-ir.cc:

Line 210:   if (pattern.len == 0) return str;
> pattern can be null as well
Yes, and it is unclear what the proper behavior is.  I elected to return the
original string in that case as well (a null pattern also has len == 0).


PS2, Line 221: // First pass - find matches to compute output size
> Thinking it a bit more, the answer could be it depends. In general, append(
This depends on how expensive the allocation and potential over-allocation are
for large strings.  We could greedily grab a large enough buffer for mid-size
strings (around 1 MB) and would probably want a different strategy, with
dynamic allocation, for very large strings or worst-case expansions.
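
For reference, a rough sketch of what the one-pass, append-based variant could
look like.  It uses std::string rather than StringVal, and ReplaceOnePass is a
hypothetical name, so this only illustrates the allocation trade-off and is not
the code in the patch.  The reserve() call is just a guess that the output is
about as large as the input; anything beyond that grows dynamically.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-alone helper, not the Impala implementation.
// Replaces every occurrence of 'pattern' in 'str' with 'replace' in one pass.
std::string ReplaceOnePass(const std::string& str, const std::string& pattern,
                           const std::string& replace) {
  // An empty pattern matches nothing useful; return the input unchanged.
  if (pattern.empty()) return str;

  std::string out;
  out.reserve(str.size());  // Initial guess; append() grows the buffer if needed.

  std::size_t pos = 0;
  while (true) {
    std::size_t match = str.find(pattern, pos);
    if (match == std::string::npos) break;
    out.append(str, pos, match - pos);  // Copy the text before the match.
    out.append(replace);                // Emit the replacement.
    pos = match + pattern.size();       // Continue after the matched pattern.
    assert(pos <= str.size());
  }
  out.append(str, pos, std::string::npos);  // Copy the remaining tail.
  return out;
}
```

The cost here is possible reallocation and copying when the replacement is
longer than the pattern; the two-pass version avoids that by scanning the
input twice.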


Line 225:       if (match_pos_in_substring < 0)
> single line
Done


PS2, Line 237: int rlen = num_matches * (replace.len - needle.len) + str.len;
> Can this overflow with a malicious replacement string ?
Yes, overflow is a real possibility.  Underflow is not.
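
One possible way to guard the length computation, sketched under the
assumption that all lengths are non-negative 32-bit ints and that the matches
do not overlap.  ComputeReplacedLength is a hypothetical helper, not something
in the patch.  The arithmetic is done in 64 bits and then range-checked.
Underflow cannot happen because the matches together consume at most str_len
bytes of the input.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: computes num_matches * (replace_len - needle_len) + str_len
// in 64-bit arithmetic. Returns false if the result does not fit in an int32_t.
bool ComputeReplacedLength(int32_t str_len, int32_t needle_len, int32_t replace_len,
                           int32_t num_matches, int32_t* out_len) {
  assert(str_len >= 0 && needle_len > 0 && replace_len >= 0 && num_matches >= 0);
  // Non-overlapping matches consume at most str_len bytes, so the total
  // below can never go negative.
  assert(static_cast<int64_t>(num_matches) * needle_len <= str_len);

  const int64_t delta = static_cast<int64_t>(replace_len) - needle_len;
  const int64_t total = static_cast<int64_t>(num_matches) * delta + str_len;
  if (total > std::numeric_limits<int32_t>::max()) return false;  // Would overflow.

  *out_len = static_cast<int32_t>(total);
  return true;
}
```

A caller would presumably return an error or NULL when this returns false,
rather than allocating a truncated buffer.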


PS2, Line 249: DCHECK(match_pos_in_substring >= 0);
> DCHECK_GE()
If we know the number of matches already, yes - i < num_matches.  If we change 
this to a one-pass algorithm, the answer is more complicated.


http://gerrit.cloudera.org:8080/#/c/5776/2/fe/src/main/cup/sql-parser.cup
File fe/src/main/cup/sql-parser.cup:

PS2, Line 2955: ident_or_restricted:ident
> Not sure if dotted_path is the right place to modify as it's used at other 
dotted_path is overloaded to be the primary term for function_name.  I am not
sure this was a perfect choice, but it looks hard to change that decision now.


-- 
To view, visit http://gerrit.cloudera.org:8080/5776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I1780a7d8fee6d0db9dad148217fb6eb10f773329
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Zach Amsden <[email protected]>
Gerrit-Reviewer: Alex Behm <[email protected]>
Gerrit-Reviewer: Dan Hecht <[email protected]>
Gerrit-Reviewer: Michael Ho <[email protected]>
Gerrit-Reviewer: Zach Amsden <[email protected]>
Gerrit-HasComments: Yes
