Michael Smith has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/20506 )

Change subject: IMPALA-12465: Unicode column name support
......................................................................


Patch Set 5:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/20506/5//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/20506/5//COMMIT_MSG@17
PS5, Line 17: This is just a primitive patch which will need a lot of testing
This sentence can be removed as the patch adds testing. Normally I would prefix 
this by TODO to imply this is an in-progress commit and I intend to finish this 
before applying it.


http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/datasets/functional/functional_schema_template.sql
File testdata/datasets/functional/functional_schema_template.sql:

http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/datasets/functional/functional_schema_template.sql@4312
PS5, Line 4312: alltypesunicode
This table seems to be unused.


http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/workloads/functional-query/queries/QueryTest/create-table-unicode.test
File 
testdata/workloads/functional-query/queries/QueryTest/create-table-unicode.test:

http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/workloads/functional-query/queries/QueryTest/create-table-unicode.test@3
PS5, Line 3: create table $DATABASE.testtbl1(`我` int, s string COMMENT 'String 
col') stored as TEXTFILE;
Do these names cover multi-byte characters? I'd suggest have a column name with 
more than one glyph too.


http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/workloads/functional-query/queries/QueryTest/create-table-unicode.test@50
PS5, Line 50: create table $DATABASE.testtbl_kudu(`我` int, s string COMMENT 
'String col', Primary key(`我`)) stored as KUDU;
I think test dimensions across different tables can take care of testing all 
the different "stored as" variations so you don't need to repeat them all.

The only example I can find is using FILE_FORMAT_TO_STORED_AS_MAP (e.g. in 
test_queries.py). So you may need to move creation out of this file.


http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/workloads/functional-query/queries/QueryTest/create-table.test
File testdata/workloads/functional-query/queries/QueryTest/create-table.test:

http://gerrit.cloudera.org:8080/#/c/20506/5/testdata/workloads/functional-query/queries/QueryTest/create-table.test@362
PS5, Line 362: ====
I'd prefer not to modify this file if there's no substantial change. You can 
reset it to what it was before your commit with

  git reset HEAD~1 -- 
testdata/workloads/functional-query/queries/QueryTest/create-table.test


http://gerrit.cloudera.org:8080/#/c/20506/5/tests/metadata/column-unicode.py
File tests/metadata/column-unicode.py:

http://gerrit.cloudera.org:8080/#/c/20506/5/tests/metadata/column-unicode.py@3
PS5, Line 3: c
> flake8: E501 line too long (147 > 90 characters)
flake8 is a Python linter, it's highlighting issues that'll need to be fixed 
before we'll merge. This line can be split to something like

  from tests.common.test_dimensions import (create_exec_option_dimension,
      create_client_protocol_dimension, 
create_client_protocol_no_strict_dimension)

You can run flake8 locally on a file with "impala-flake8 <file path>".


http://gerrit.cloudera.org:8080/#/c/20506/5/tests/metadata/column-unicode.py@6
PS5, Line 6: class TestUnicode(ImpalaTestSuite):
For test naming, file and test class should be similar and match others in the 
file. So I'd update them to "test_column_unicode.py" and "TestColumnUnicode".


http://gerrit.cloudera.org:8080/#/c/20506/5/tests/metadata/column-unicode.py@12
PS5, Line 12:     def add_test_dimensions(cls):
If these are quick tests that apply to all table formats, I would omit 
overriding add_test_dimensions entirely. If you do need to override it, provide 
comments explaining why you made these choices.


http://gerrit.cloudera.org:8080/#/c/20506/5/tests/shell/test_shell_interactive.py
File tests/shell/test_shell_interactive.py:

http://gerrit.cloudera.org:8080/#/c/20506/5/tests/shell/test_shell_interactive.py@389
PS5, Line 389:     args = ("create table {0}.test_tbl(`我` int, s string COMMENT 
'String col') STORED AS TEXTFILE;"
This feels like a different test. Please split it into a new function named 
something like test_unicode_columns.


http://gerrit.cloudera.org:8080/#/c/20506/5/tests/shell/test_shell_interactive.py@404
PS5, Line 404:     assert "Fetched 1 row(s)" in result.stderr, result.stderr
Can we also validate the output column headers are formatted correctly? Should 
be part of result.stdout.



--
To view, visit http://gerrit.cloudera.org:8080/20506
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1ad9d63ac1b9631a0f4a433798bd5109aa2ed718
Gerrit-Change-Number: 20506
Gerrit-PatchSet: 5
Gerrit-Owner: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Daniel Becker <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Comment-Date: Tue, 28 Nov 2023 00:17:25 +0000
Gerrit-HasComments: Yes

Reply via email to