Riza Suminto has posted comments on this change. ( http://gerrit.cloudera.org:8080/21039 )
Change subject: IMPALA-12786: Optimize count(*) for JSON scans ...................................................................... Patch Set 4: (5 comments) Thank you for filing this patch. I have some questions and comments. http://gerrit.cloudera.org:8080/#/c/21039/4/be/src/exec/json/json-parser.cc File be/src/exec/json/json-parser.cc: http://gerrit.cloudera.org:8080/#/c/21039/4/be/src/exec/json/json-parser.cc@37 PS4, Line 37: This class is : /// essentially a simplified version of a rapidjson parser, removing specific data parsing : /// and copying operations, allowing for faster parsing of the number of JSON objects. Can you put reference link to rapidjson code where this class is based from? http://gerrit.cloudera.org:8080/#/c/21039/4/be/src/exec/json/json-parser.cc@45 PS4, Line 45: class JsonSkipper { Can you add more tests for this class? Please test for all possible positive cases where skipping successful and negative cases where skipping fail and return error. http://gerrit.cloudera.org:8080/#/c/21039/4/be/src/exec/json/json-parser.cc@126 PS4, Line 126: bool SkipNumber() { : Consume('-'); Can you add comment about valid number literal? What if it begin with '.' or '+' ? http://gerrit.cloudera.org:8080/#/c/21039/4/be/src/exec/json/json-parser.cc@182 PS4, Line 182: case 'n': RETURN_IF_FALSE(SkipNull()); break; : case 't': RETURN_IF_FALSE(SkipTrue()); break; : case 'f': RETURN_IF_FALSE(SkipFalse()); break; Is s_ always lower case? http://gerrit.cloudera.org:8080/#/c/21039/4/testdata/workloads/functional-query/queries/QueryTest/malformed_json.test File testdata/workloads/functional-query/queries/QueryTest/malformed_json.test: http://gerrit.cloudera.org:8080/#/c/21039/4/testdata/workloads/functional-query/queries/QueryTest/malformed_json.test@27 PS4, Line 27: select count(*) from malformed_json : ---- TYPES : bigint : ---- RESULTS : 13 Is the same query yield the same result before this patch? -- To view, visit http://gerrit.cloudera.org:8080/21039 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I97ff097661c3c577aeafeeb1518408ce7a8a255e Gerrit-Change-Number: 21039 Gerrit-PatchSet: 4 Gerrit-Owner: Zihao Ye <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Riza Suminto <[email protected]> Gerrit-Comment-Date: Thu, 21 Mar 2024 03:58:30 +0000 Gerrit-HasComments: Yes
