Quanlong Huang has posted comments on this change. ( http://gerrit.cloudera.org:8080/19699 )
Change subject: IMPALA-10798: Prototype a simple JSON File reader ...................................................................... Patch Set 18: (10 comments) http://gerrit.cloudera.org:8080/#/c/19699/18//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19699/18//COMMIT_MSG@23 PS18, Line 23: Could you add a section for the current limitations? E.g. Limitations - multiline json objects are not supported - compressed json files are not supported Does this patch support parsing complex types from json files? http://gerrit.cloudera.org:8080/#/c/19699/18/be/src/exec/json-parser.h File be/src/exec/json-parser.h: http://gerrit.cloudera.org:8080/#/c/19699/18/be/src/exec/json-parser.h@45 PS18, Line 45: begin and end nit: add quotes on var names, i.e. 'begin' and 'end' http://gerrit.cloudera.org:8080/#/c/19699/18/be/src/exec/json-parser.h@49 PS18, Line 49: some nit: remove "some" http://gerrit.cloudera.org:8080/#/c/19699/18/be/src/exec/json-parser.h@68 PS18, Line 68: /// bool AddNumber(int index, const char* str, uint32_t len); nit: it'd be helpful to give a doc link here. E.g. /// See more in https://rapidjson.org/classrapidjson_1_1_handler.html http://gerrit.cloudera.org:8080/#/c/19699/18/be/src/exec/json-parser.h@143 PS18, Line 143: !memcmp(field_found_.data(), field_found_.data() + 1, field_found_.size() - 1))); This is not that readable. Could you add some comments and break them into smaller conditions? http://gerrit.cloudera.org:8080/#/c/19699/18/be/src/exec/json-parser.h@169 PS18, Line 169: rapidjson::kParseStopWhenDoneFlag; What about kParseInsituFlag? Can we use it in our scenario? http://gerrit.cloudera.org:8080/#/c/19699/18/be/src/exec/json-parser.h@230 PS18, Line 230: DCHECK(!IsRequiredField()); nit: This is a bit confusing. It might be better to use DCHECK_EQ(current_field_idx_, -1) directly. http://gerrit.cloudera.org:8080/#/c/19699/18/be/src/exec/json-parser.h@242 PS18, Line 242: bool StartObject() { nit: could you add a comment before each methods mentioning they are interfaces of handler used in RapidJson? e.g. /// Handler methods used in Rapidjson http://gerrit.cloudera.org:8080/#/c/19699/18/be/src/exec/json/hdfs-json-scanner.cc File be/src/exec/json/hdfs-json-scanner.cc: http://gerrit.cloudera.org:8080/#/c/19699/18/be/src/exec/json/hdfs-json-scanner.cc@114 PS18, Line 114: the scan range before nit: "the previous scan range in the same file" might be more clear http://gerrit.cloudera.org:8080/#/c/19699/18/be/src/exec/json/hdfs-json-scanner.cc@160 PS18, Line 160: if( nit: add a space after "if" -- To view, visit http://gerrit.cloudera.org:8080/19699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I31309cb8f2d04722a0508b3f9b8f1532ad49a569 Gerrit-Change-Number: 19699 Gerrit-PatchSet: 18 Gerrit-Owner: Zihao Ye <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Zihao Ye <[email protected]> Gerrit-Comment-Date: Tue, 18 Jul 2023 13:41:06 +0000 Gerrit-HasComments: Yes
