Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/10950 )

Change subject: IMPALA-376: add built-in functions for parsing JSON
......................................................................


Patch Set 9:

(12 comments)

Thanks for taking this on! It is a very useful functionality IMO.

http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc
File be/src/exprs/string-functions.cc:

http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@67
PS9, Line 67: thread_local FunctionContext* UdfJsonAllocator::ctx_ = nullptr;
nit: please insert empty line after class definition


http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@69
PS9, Line 69: RapidJSON required the input strings be end of '\0'.
I don't think it's grammatically correct. Maybe:
"RapidJSON requires input strings that end with a trailing '\0'"


http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@70
PS9, Line 70: tailing
nit: trailing


http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@96
PS9, Line 96: src_
nit: read_position_ would be a better name I think


http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@166
PS9, Line 166: values.erase
Maybe it'd be better to have two vectors and juggling the values between them 
instead of this erasing and shifting.


http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@194
PS9, Line 194: (path[path_idx] != ' ')
I am not sure that there is a test for it. Either for the true and false cases.


http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@195
PS9, Line 195: return -1;
Maybe taking the function context and set some error would be helpful for the 
user.

Since there is a problem with the 'path_str' we can consider it as an error IMO.


http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@199
PS9, Line 199: return NULL
This function returns -1, the return NULL statement is in GetJsonObjectImpl(), 
so this comment here is a bit confusing.

Maybe it could also set this error in the function context.


http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@217
PS9, Line 217: return -1;
Error could be set in function context. We could also use 'path_idx' in the 
error message.


http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@251
PS9, Line 251:   vector<UdfValue*> values;
The C++ standard doesn't specify the initial capacity of std::vector, but it is 
likely to be zero.

Calling values.reserve() with some starting value (maybe 64) could be 
beneficial.


http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@267
PS9, Line 267: (end == nullptr)
Set error in function context?


http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/util/string-util.h
File be/src/util/string-util.h:

http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/util/string-util.h@53
PS9, Line 53: AdvanceIdentifier
Maybe 'FindIdentifierEnd' or something like that would be a better name.



--
To view, visit http://gerrit.cloudera.org:8080/10950
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f
Gerrit-Change-Number: 10950
Gerrit-PatchSet: 9
Gerrit-Owner: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Todd Lipcon <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Fri, 17 Aug 2018 14:21:42 +0000
Gerrit-HasComments: Yes

Reply via email to