Zihao Ye has posted comments on this change. ( http://gerrit.cloudera.org:8080/19699 )
Change subject: IMPALA-10798: Prototype a simple JSON File reader ...................................................................... Patch Set 15: (5 comments) Thanks for the review! http://gerrit.cloudera.org:8080/#/c/19699/15//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/19699/15//COMMIT_MSG@11 PS15, Line 11: > Could you summarize the high level design? Done http://gerrit.cloudera.org:8080/#/c/19699/15/testdata/bin/generate-schema-statements.py File testdata/bin/generate-schema-statements.py: http://gerrit.cloudera.org:8080/#/c/19699/15/testdata/bin/generate-schema-statements.py@209 PS15, Line 209: 'json': "JSONFILE" > nit: add a comma at the end Done http://gerrit.cloudera.org:8080/#/c/19699/15/testdata/datasets/functional/schema_constraints.csv File testdata/datasets/functional/schema_constraints.csv: http://gerrit.cloudera.org:8080/#/c/19699/15/testdata/datasets/functional/schema_constraints.csv@198 PS15, Line 198: table_name:decimal_tiny, constraint:restrict_to, table_format:orc/def/block > I think we need to add json here and for several other tables. > Otherwise, some tables are missing in JSON format. Done, but I'm not sure if there are any other tables that need to be added because I'm not familiar with their purpose. http://gerrit.cloudera.org:8080/#/c/19699/15/tests/query_test/test_queries.py File tests/query_test/test_queries.py: http://gerrit.cloudera.org:8080/#/c/19699/15/tests/query_test/test_queries.py@217 PS15, Line 217: class TestQueriesTextTables(ImpalaTestSuite): > Can we add some tests like these for JSON? E.g. tests for > multi-line strings, multi-line json objects, malformed json > objects. Done, if there are any other necessary tests that need to be added, please let me know. http://gerrit.cloudera.org:8080/#/c/19699/15/tests/query_test/test_scanners_fuzz.py File tests/query_test/test_scanners_fuzz.py: http://gerrit.cloudera.org:8080/#/c/19699/15/tests/query_test/test_scanners_fuzz.py@98 PS15, Line 98: elif table_format.file_format in ('rc', 'seq', 'json'): > Why do we skip JSON here? I didn't generate any JSON tables starts with 'decimal_', because of the comment say "Decimal can only be tested on formats Impala can write to (text and parquet)", so I skipped it. Now that we have these tables, the test can pass, so it ?won't be skipped anymore. -- To view, visit http://gerrit.cloudera.org:8080/19699 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I31309cb8f2d04722a0508b3f9b8f1532ad49a569 Gerrit-Change-Number: 19699 Gerrit-PatchSet: 15 Gerrit-Owner: Zihao Ye <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Reviewer: Zihao Ye <[email protected]> Gerrit-Comment-Date: Mon, 26 Jun 2023 11:53:38 +0000 Gerrit-HasComments: Yes
