Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/12004 )

Change subject: IMPALA-7889: Write new logical types in Parquet
......................................................................


Patch Set 7:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/12004/7/be/src/exec/parquet/parquet-metadata-utils.cc
File be/src/exec/parquet/parquet-metadata-utils.cc:

http://gerrit.cloudera.org:8080/#/c/12004/7/be/src/exec/parquet/parquet-metadata-utils.cc@158
PS7, Line 158: static bool IsEncodingSupported(parquet::Encoding::type e) {
There's already a "IsSupportedType()" in the anonymous namespace above, I think 
we can move this there, and add a comment to explain what it does.


http://gerrit.cloudera.org:8080/#/c/12004/7/be/src/exec/parquet/parquet-metadata-utils.cc@290
PS7, Line 290: namespace {
I'd consider moving all anonymous helpers up into one anonymous namespace


http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@388
PS5, Line 388:     found = False
> Sorry, I forgot this one in patch set 6.
You could rename _find_schema to _get_schema() if you feel that that would 
express more clearly that it actually has to exist, and then add an assert 
there instead of here (since there doesn't seem to be a case where it doesn't 
exist). I don't feel strongly about this.


http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py
File tests/query_test/test_insert_parquet.py:

http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@37
PS7, Line 37: from tests.util.get_parquet_metadata import decode_stats_value, \
nit: Wrap these in parentheses, and while you're here the ones above, too. (see 
https://www.python.org/dev/peps/pep-0328/)


http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@388
PS7, Line 388:     found = False
I think you can shorten this to something like:

 keys = [k for k, v in obj_dict.iteritems() if v is not None]
 assert keys == [var_name]

Or make it one line if you prefer


http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@410
PS7, Line 410:       8: ConvertedType.INT_8,
nit: I think we indent 4 spaces here


http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@456
PS7, Line 456:     # This test will break once INT64 becomes the default 
Parquet type for TIMESTAMP
Is there actually a Jira we can add here? If not that's ok, too



--
To view, visit http://gerrit.cloudera.org:8080/12004
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91
Gerrit-Change-Number: 12004
Gerrit-PatchSet: 7
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Comment-Date: Wed, 12 Dec 2018 05:31:50 +0000
Gerrit-HasComments: Yes

Reply via email to