Zoltan Borok-Nagy has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/9693 )

Change subject: IMPALA-5842: Write page index in Parquet files
......................................................................


Patch Set 5:

(16 comments)

Thanks Csaba!
(and I also updated my vimrc... :) )

http://gerrit.cloudera.org:8080/#/c/9693/4/be/src/exec/parquet-column-stats.h
File be/src/exec/parquet-column-stats.h:

http://gerrit.cloudera.org:8080/#/c/9693/4/be/src/exec/parquet-column-stats.h@159
PS4, Line 159:   // If true, min/max values are ascending.
> nit: missing .
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/be/src/exec/parquet-column-stats.h@162
PS4, Line 162:   // if true min/max values are descending.
> nit: missing .
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/be/src/exec/parquet-column-stats.inline.h
File be/src/exec/parquet-column-stats.inline.h:

http://gerrit.cloudera.org:8080/#/c/9693/4/be/src/exec/parquet-column-stats.inline.h@207
PS4, Line 207:   if (prev_page_min_buffer_.IsEmpty()) {
             :     RETURN_IF_ERROR(CopyToBuffer(&prev_page_min_buffer_, 
&prev_page_min_value_));
> nit: long lines
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/tests/query_test/test_parquet_page_index.py
File tests/query_test/test_parquet_page_index.py:

http://gerrit.cloudera.org:8080/#/c/9693/4/tests/query_test/test_parquet_page_index.py@71
PS4, Line 71:             column_index_length)
> nit: not enough indentation
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/tests/query_test/test_parquet_page_index.py@82
PS4, Line 82:             offset_index_length)
> nit: not enough indentation
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/tests/query_test/test_parquet_page_index.py@85
PS4, Line 85:           page_header = get_page_header(parquet_file, 
page_loc.offset,
            :               page_loc.compressed_page_size)
            :           page_headers.append(page_header)
> nit: too much indentation
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/tests/query_test/test_parquet_page_index.py@113
PS4, Line 113:     for null_page, value in zip(null_pages, values):
> Are these prints intentional?
No, jus left them here.


http://gerrit.cloudera.org:8080/#/c/9693/4/tests/query_test/test_parquet_page_index.py@118
PS4, Line 118:         elif ordering == BoundaryOrder.DESCENDING and 
previouse_value is not None:
> Is this print intentional?
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/tests/query_test/test_parquet_page_index.py@120
PS4, Line 120:         previous_value = current_value
> nit: too much indentation
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/tests/query_test/test_parquet_page_index.py@122
PS4, Line 122:
> nit: too much indentation
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/tests/query_test/test_parquet_page_index.py@164
PS4, Line 164:               assert null_page
             :             # Everything is None, no further checks needed.
             :             return
             :
             :           min_value = decode_stats_value(column_info.schema, 
min_value_str)
             :           for null_p
> nit: too much indentation
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/tests/query_test/test_parquet_page_index.py@200
PS4, Line 200:           ).format(qualified_table_name, sorting_column, 
source_table)
> nit: not enough indentation
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/tests/query_test/test_parquet_page_index.py@203
PS4, Line 203:     self._validate_parquet_index(hdfs_path, sorting_column, 
tmpdir)
> nit: not enough indentation
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/tests/query_test/test_parquet_page_index.py@244
PS4, Line 244:
> nit: missing .
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/tests/query_test/test_parquet_page_index.py@249
PS4, Line 249:     self._ctas_table_and_verify_index(vector, unique_database,
> nit: missing .
Done


http://gerrit.cloudera.org:8080/#/c/9693/4/tests/util/get_parquet_metadata.py
File tests/util/get_parquet_metadata.py:

http://gerrit.cloudera.org:8080/#/c/9693/4/tests/util/get_parquet_metadata.py@169
PS4, Line 169: def get_column_index(filename, file_pos, length):
             :   """Reads a ColumnIndex object from a file at the given 
position."""
             :   serialized_column_index = read_serialized_object(filename, 
file_pos, length)
             :   protocol = create_protocol(serialized_column_index)
             :   column_index = ColumnIndex()
             :   column_index.read(protocol)
             :   return column_index
             :
             :
             : def get_page_header(filename, file_pos, length):
             :   """Reads a PageHeader object from a file at the given 
position."""
             :   serialized_page_header = read_serialized_object(filename, 
file_pos, length)
             :   protocol = create_protocol(serialized_page_header)
             :   page_header = PageHeader()
             :   page_header.read(protocol)
             :   return page_header
             :
             :
             : def get_offset_index(filename, file_pos, length):
             :   """Reads an OffsetIndex object from a file at the given 
position."""
             :   serialized_offset_index = read_serialized_object(filename, 
file_pos, length)
             :   protocol = create_protocol(serialized_offset_index)
             :   offset_index = OffsetIndex()
             :   offset_index.read(protocol)
             :   return offset_index
> nit: too much indentation
Done



--
To view, visit http://gerrit.cloudera.org:8080/9693
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: Icbacf7fe3b7672e3ce719261ecef445b16f8dec9
Gerrit-Change-Number: 9693
Gerrit-PatchSet: 5
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>
Gerrit-Comment-Date: Mon, 26 Mar 2018 17:48:47 +0000
Gerrit-HasComments: Yes

Reply via email to