Zoltan Borok-Nagy has posted comments on this change. ( http://gerrit.cloudera.org:8080/10950 )
Change subject: IMPALA-376: add built-in functions for parsing JSON ...................................................................... Patch Set 9: (12 comments) Thanks for taking this on! It is a very useful functionality IMO. http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc File be/src/exprs/string-functions.cc: http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@67 PS9, Line 67: thread_local FunctionContext* UdfJsonAllocator::ctx_ = nullptr; nit: please insert empty line after class definition http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@69 PS9, Line 69: RapidJSON required the input strings be end of '\0'. I don't think it's grammatically correct. Maybe: "RapidJSON requires input strings that end with a trailing '\0'" http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@70 PS9, Line 70: tailing nit: trailing http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@96 PS9, Line 96: src_ nit: read_position_ would be a better name I think http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@166 PS9, Line 166: values.erase Maybe it'd be better to have two vectors and juggling the values between them instead of this erasing and shifting. http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@194 PS9, Line 194: (path[path_idx] != ' ') I am not sure that there is a test for it. Either for the true and false cases. http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@195 PS9, Line 195: return -1; Maybe taking the function context and set some error would be helpful for the user. Since there is a problem with the 'path_str' we can consider it as an error IMO. http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@199 PS9, Line 199: return NULL This function returns -1, the return NULL statement is in GetJsonObjectImpl(), so this comment here is a bit confusing. Maybe it could also set this error in the function context. http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@217 PS9, Line 217: return -1; Error could be set in function context. We could also use 'path_idx' in the error message. http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@251 PS9, Line 251: vector<UdfValue*> values; The C++ standard doesn't specify the initial capacity of std::vector, but it is likely to be zero. Calling values.reserve() with some starting value (maybe 64) could be beneficial. http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/exprs/string-functions.cc@267 PS9, Line 267: (end == nullptr) Set error in function context? http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/util/string-util.h File be/src/util/string-util.h: http://gerrit.cloudera.org:8080/#/c/10950/9/be/src/util/string-util.h@53 PS9, Line 53: AdvanceIdentifier Maybe 'FindIdentifierEnd' or something like that would be a better name. -- To view, visit http://gerrit.cloudera.org:8080/10950 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6a9d3598cb3beca0865a7edb094f3a5b602dbd2f Gerrit-Change-Number: 10950 Gerrit-PatchSet: 9 Gerrit-Owner: Quanlong Huang <[email protected]> Gerrit-Reviewer: Attila Jeges <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Todd Lipcon <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Fri, 17 Aug 2018 14:21:42 +0000 Gerrit-HasComments: Yes
