Lars Volker has posted comments on this change.

Change subject: IMPALA-5210: Count rows and collection items in parquet scanner 
separately
......................................................................


Patch Set 2:

(9 comments)

http://gerrit.cloudera.org:8080/#/c/7776/2/be/src/exec/hdfs-parquet-scanner.cc
File be/src/exec/hdfs-parquet-scanner.cc:

Line 963:   int64_t num_rows_read = 0, num_coll_items_read = 0;
We usually limit declarations to one per line per our style guide.


Line 964:   auto update_read_count = MakeScopeExitTrigger([&] {
I find that this construct makes the code harder to reason about. Now someone 
reading it has to keep in mind that additional code will be executed when the 
function returns. Can you instead update the counters when CommitRows gets 
called below? Alternatively you could change the "return" statement in the 
while loop to a break; and then update the counters before the final return.


PS2, Line 978: continue_execution
initialize


Line 1210:     CollectionValueBuilder* coll_value_builder, int64_t* 
num_coll_item_read) {
Can you remove these from the function signatures (see my comment in the 
header).


Line 1277:   (*num_coll_item_read) += rows_read;
are the () needed here?


http://gerrit.cloudera.org:8080/#/c/7776/2/be/src/exec/hdfs-parquet-scanner.h
File be/src/exec/hdfs-parquet-scanner.h:

Line 475:   RuntimeProfile::Counter* num_dict_filtered_row_groups_counter_;
Can you do the counting using a member in the scanner instead of passing it 
through all ReadValue*() functions? That seems to keep these function 
signatures more clean.


http://gerrit.cloudera.org:8080/#/c/7776/2/be/src/exec/parquet-column-readers.cc
File be/src/exec/parquet-column-readers.cc:

PS2, Line 242: /*num_coll_item_read*/
We usually still name parameters the same, even if some methods don't use them 
(e.g. the pool parameter in some of these methods).


http://gerrit.cloudera.org:8080/#/c/7776/2/be/src/exec/parquet-column-readers.h
File be/src/exec/parquet-column-readers.h:

PS2, Line 505: 
removing the virtual keyword breaks consistency with the other methods in this 
class.


PS2, Line 546: 4
?


-- 
To view, visit http://gerrit.cloudera.org:8080/7776
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I7f6efddaea18507482940f5bdab7326b6482b067
Gerrit-PatchSet: 2
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Tianyi Wang <[email protected]>
Gerrit-Reviewer: Lars Volker <[email protected]>
Gerrit-HasComments: Yes

Reply via email to