Lars Volker has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9693 )

Change subject: IMPALA-5842: Write page index in Parquet files
......................................................................


Patch Set 17:

(5 comments)

Only a few minor comments, otherwise looks good to me.

http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/util/string-util-test.cc
File be/src/util/string-util-test.cc:

http://gerrit.cloudera.org:8080/#/c/9693/15/be/src/util/string-util-test.cc@68
PS15, Line 68:   EvalTruncation(normal_plus_max, "abcdeg", 10, UP);
> I had a similar test, when I used the '127' literal.
Thx. I don't feel strongly about using decimal vs hex literals. What I was 
aiming for was the last byte being uchar_max, and the one before being 
schar_max. LGTM now.


http://gerrit.cloudera.org:8080/#/c/9693/17/tests/query_test/test_parquet_page_index.py
File tests/query_test/test_parquet_page_index.py:

http://gerrit.cloudera.org:8080/#/c/9693/17/tests/query_test/test_parquet_page_index.py@71
PS17, Line 71:         column_index = read_serialized_object(ColumnIndex, 
parquet_file,
It might make things faster / more elegant to pass an open file here, otherwise 
you're opening the same file 3 times for each column. Feel free to ignore 
though as this is not performance critical code.


http://gerrit.cloudera.org:8080/#/c/9693/17/tests/query_test/test_parquet_page_index.py@126
PS17, Line 126: page_is_null is False
Why not "not page_is_null"?


http://gerrit.cloudera.org:8080/#/c/9693/17/tests/query_test/test_parquet_page_index.py@179
PS17, Line 179: reverse
"reverse" seems to rely on everyone thinking of ascending as the default. Maybe 
rename parameter to "ascending" or "descending"? Or even pass BoundaryOrder, 
and implement UNORDERED as

  return not is_sorted(, ASCENDING) and not is_sorted(l, DESCENDING)

?


http://gerrit.cloudera.org:8080/#/c/9693/17/tests/util/get_parquet_metadata.py
File tests/util/get_parquet_metadata.py:

http://gerrit.cloudera.org:8080/#/c/9693/17/tests/util/get_parquet_metadata.py@163
PS17, Line 163:     metadata = FileMetaData()
If you follow the suggestion in the other file to pass an open file to 
read_serialized_object, you could use it here, too.



--
To view, visit http://gerrit.cloudera.org:8080/9693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Gerrit-Change-Number: 9693
Gerrit-PatchSet: 17
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Anonymous Coward #248
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Mon, 14 May 2018 19:44:37 +0000
Gerrit-HasComments: Yes

Reply via email to