Michael Smith has posted comments on this change. ( http://gerrit.cloudera.org:8080/20506 )
Change subject: IMPALA-12465: Unicode column name support ...................................................................... Patch Set 5: (10 comments) http://gerrit.cloudera.org:8080/#/c/20506/5//COMMIT_MSG Commit Message: http://gerrit.cloudera.org:8080/#/c/20506/5//COMMIT_MSG@17 PS5, Line 17: This is just a primitive patch which will need a lot of testing This sentence can be removed as the patch adds testing. Normally I would prefix this by TODO to imply this is an in-progress commit and I intend to finish this before applying it. http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/datasets/functional/functional_schema_template.sql File testdata/datasets/functional/functional_schema_template.sql: http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/datasets/functional/functional_schema_template.sql@4312 PS5, Line 4312: alltypesunicode This table seems to be unused. http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/workloads/functional-query/queries/QueryTest/create-table-unicode.test File testdata/workloads/functional-query/queries/QueryTest/create-table-unicode.test: http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/workloads/functional-query/queries/QueryTest/create-table-unicode.test@3 PS5, Line 3: create table $DATABASE.testtbl1(`我` int, s string COMMENT 'String col') stored as TEXTFILE; Do these names cover multi-byte characters? I'd suggest have a column name with more than one glyph too. http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/workloads/functional-query/queries/QueryTest/create-table-unicode.test@50 PS5, Line 50: create table $DATABASE.testtbl_kudu(`我` int, s string COMMENT 'String col', Primary key(`我`)) stored as KUDU; I think test dimensions across different tables can take care of testing all the different "stored as" variations so you don't need to repeat them all. The only example I can find is using FILE_FORMAT_TO_STORED_AS_MAP (e.g. in test_queries.py). So you may need to move creation out of this file. http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/workloads/functional-query/queries/QueryTest/create-table.test File testdata/workloads/functional-query/queries/QueryTest/create-table.test: http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/workloads/functional-query/queries/QueryTest/create-table.test@362 PS5, Line 362: ==== I'd prefer not to modify this file if there's no substantial change. You can reset it to what it was before your commit with git reset HEAD~1 -- testdata/workloads/functional-query/queries/QueryTest/create-table.test http://gerrit.cloudera.org:8080/#/c/20506/5/tests/metadata/column-unicode.py File tests/metadata/column-unicode.py: http://gerrit.cloudera.org:8080/#/c/20506/5/tests/metadata/column-unicode.py@3 PS5, Line 3: c > flake8: E501 line too long (147 > 90 characters) flake8 is a Python linter, it's highlighting issues that'll need to be fixed before we'll merge. This line can be split to something like from tests.common.test_dimensions import (create_exec_option_dimension, create_client_protocol_dimension, create_client_protocol_no_strict_dimension) You can run flake8 locally on a file with "impala-flake8 <file path>". http://gerrit.cloudera.org:8080/#/c/20506/5/tests/metadata/column-unicode.py@6 PS5, Line 6: class TestUnicode(ImpalaTestSuite): For test naming, file and test class should be similar and match others in the file. So I'd update them to "test_column_unicode.py" and "TestColumnUnicode". http://gerrit.cloudera.org:8080/#/c/20506/5/tests/metadata/column-unicode.py@12 PS5, Line 12: def add_test_dimensions(cls): If these are quick tests that apply to all table formats, I would omit overriding add_test_dimensions entirely. If you do need to override it, provide comments explaining why you made these choices. http://gerrit.cloudera.org:8080/#/c/20506/5/tests/shell/test_shell_interactive.py File tests/shell/test_shell_interactive.py: http://gerrit.cloudera.org:8080/#/c/20506/5/tests/shell/test_shell_interactive.py@389 PS5, Line 389: args = ("create table {0}.test_tbl(`我` int, s string COMMENT 'String col') STORED AS TEXTFILE;" This feels like a different test. Please split it into a new function named something like test_unicode_columns. http://gerrit.cloudera.org:8080/#/c/20506/5/tests/shell/test_shell_interactive.py@404 PS5, Line 404: assert "Fetched 1 row(s)" in result.stderr, result.stderr Can we also validate the output column headers are formatted correctly? Should be part of result.stdout. -- To view, visit http://gerrit.cloudera.org:8080/20506 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I1ad9d63ac1b9631a0f4a433798bd5109aa2ed718 Gerrit-Change-Number: 20506 Gerrit-PatchSet: 5 Gerrit-Owner: Anonymous Coward <[email protected]> Gerrit-Reviewer: Anonymous Coward <[email protected]> Gerrit-Reviewer: Daniel Becker <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Michael Smith <[email protected]> Gerrit-Reviewer: Quanlong Huang <[email protected]> Gerrit-Comment-Date: Tue, 28 Nov 2023 00:17:25 +0000 Gerrit-HasComments: Yes
