Lars Volker has posted comments on this change. ( http://gerrit.cloudera.org:8080/9693 )
Change subject: IMPALA-5842: Write page index in Parquet files ...................................................................... Patch Set 17: (5 comments) Only a few minor comments, otherwise looks good to me. http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/util/string-util-test.cc File be/src/util/string-util-test.cc: http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/util/string-util-test.cc@68 PS15, Line 68: EvalTruncation(normal_plus_max, "abcdeg", 10, UP); > I had a similar test, when I used the '127' literal. Thx. I don't feel strongly about using decimal vs hex literals. What I was aiming for was the last byte being uchar_max, and the one before being schar_max. LGTM now. http://gerrit.cloudera.org:8080/#/c/9693/17/tests/query_test/test_parquet_page_index.py File tests/query_test/test_parquet_page_index.py: http://gerrit.cloudera.org:8080/#/c/9693/17/tests/query_test/test_parquet_page_index.py@71 PS17, Line 71: column_index = read_serialized_object(ColumnIndex, parquet_file, It might make things faster / more elegant to pass an open file here, otherwise you're opening the same file 3 times for each column. Feel free to ignore though as this is not performance critical code. http://gerrit.cloudera.org:8080/#/c/9693/17/tests/query_test/test_parquet_page_index.py@126 PS17, Line 126: page_is_null is False Why not "not page_is_null"? http://gerrit.cloudera.org:8080/#/c/9693/17/tests/query_test/test_parquet_page_index.py@179 PS17, Line 179: reverse "reverse" seems to rely on everyone thinking of ascending as the default. Maybe rename parameter to "ascending" or "descending"? Or even pass BoundaryOrder, and implement UNORDERED as return not is_sorted(, ASCENDING) and not is_sorted(l, DESCENDING) ? http://gerrit.cloudera.org:8080/#/c/9693/17/tests/util/get_parquet_metadata.py File tests/util/get_parquet_metadata.py: http://gerrit.cloudera.org:8080/#/c/9693/17/tests/util/get_parquet_metadata.py@163 PS17, Line 163: metadata = FileMetaData() If you follow the suggestion in the other file to pass an open file to read_serialized_object, you could use it here, too. -- To view, visit http://gerrit.cloudera.org:8080/9693 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-MessageType: comment Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9 Gerrit-Change-Number: 9693 Gerrit-PatchSet: 17 Gerrit-Owner: Zoltan Borok-Nagy <[email protected]> Gerrit-Reviewer: Anonymous Coward #248 Gerrit-Reviewer: Csaba Ringhofer <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Tim Armstrong <[email protected]> Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]> Gerrit-Comment-Date: Mon, 14 May 2018 19:44:37 +0000 Gerrit-HasComments: Yes
