Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/12004 )
Change subject: IMPALA-7889: Write new logical types in Parquet ...................................................................... Patch Set 7: (7 comments) http://gerrit.cloudera.org:8080/#/c/12004/7/be/src/exec/parquet/parquet-metadata-utils.cc File be/src/exec/parquet/parquet-metadata-utils.cc: http://gerrit.cloudera.org:8080/#/c/12004/7/be/src/exec/parquet/parquet-metadata-utils.cc@158 PS7, Line 158: static bool IsEncodingSupported(parquet::Encoding::type e) { There's already a "IsSupportedType()" in the anonymous namespace above, I think we can move this there, and add a comment to explain what it does. http://gerrit.cloudera.org:8080/#/c/12004/7/be/src/exec/parquet/parquet-metadata-utils.cc@290 PS7, Line 290: namespace { I'd consider moving all anonymous helpers up into one anonymous namespace http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py File tests/query_test/test_insert_parquet.py: http://gerrit.cloudera.org:8080/#/c/12004/5/tests/query_test/test_insert_parquet.py@388 PS5, Line 388: found = False > Sorry, I forgot this one in patch set 6. You could rename _find_schema to _get_schema() if you feel that that would express more clearly that it actually has to exist, and then add an assert there instead of here (since there doesn't seem to be a case where it doesn't exist). I don't feel strongly about this. http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py File tests/query_test/test_insert_parquet.py: http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@37 PS7, Line 37: from tests.util.get_parquet_metadata import decode_stats_value, \ nit: Wrap these in parentheses, and while you're here the ones above, too. (see https://www.python.org/dev/peps/pep-0328/) http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@388 PS7, Line 388: found = False I think you can shorten this to something like: keys = [k for k, v in obj_dict.iteritems() if v is not None] assert keys == [var_name] Or make it one line if you prefer http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@410 PS7, Line 410: 8: ConvertedType.INT_8, nit: I think we indent 4 spaces here http://gerrit.cloudera.org:8080/#/c/12004/7/tests/query_test/test_insert_parquet.py@456 PS7, Line 456: # This test will break once INT64 becomes the default Parquet type for TIMESTAMP Is there actually a Jira we can add here? If not that's ok, too -- To view, visit http://gerrit.cloudera.org:8080/12004 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: I6f377950845683ab9c6dea79f4c54db0359d0b91 Gerrit-Change-Number: 12004 Gerrit-PatchSet: 7 Gerrit-Owner: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Impala Public Jenkins <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Comment-Date: Wed, 12 Dec 2018 05:31:50 +0000 Gerrit-HasComments: Yes
