Lars Volker has posted comments on this change. Change subject: IMPALA-5210: Count rows and collection items in parquet scanner separately ......................................................................
Patch Set 2: (9 comments) http://gerrit.cloudera.org:8080/#/c/7776/2/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 963: int64_t num_rows_read = 0, num_coll_items_read = 0; We usually limit declarations to one per line per our style guide. Line 964: auto update_read_count = MakeScopeExitTrigger([&] { I find that this construct makes the code harder to reason about. Now someone reading it has to keep in mind that additional code will be executed when the function returns. Can you instead update the counters when CommitRows gets called below? Alternatively you could change the "return" statement in the while loop to a break; and then update the counters before the final return. PS2, Line 978: continue_execution initialize Line 1210: CollectionValueBuilder* coll_value_builder, int64_t* num_coll_item_read) { Can you remove these from the function signatures (see my comment in the header). Line 1277: (*num_coll_item_read) += rows_read; are the () needed here? http://gerrit.cloudera.org:8080/#/c/7776/2/be/src/exec/hdfs-parquet-scanner.h File be/src/exec/hdfs-parquet-scanner.h: Line 475: RuntimeProfile::Counter* num_dict_filtered_row_groups_counter_; Can you do the counting using a member in the scanner instead of passing it through all ReadValue*() functions? That seems to keep these function signatures more clean. http://gerrit.cloudera.org:8080/#/c/7776/2/be/src/exec/parquet-column-readers.cc File be/src/exec/parquet-column-readers.cc: PS2, Line 242: /*num_coll_item_read*/ We usually still name parameters the same, even if some methods don't use them (e.g. the pool parameter in some of these methods). http://gerrit.cloudera.org:8080/#/c/7776/2/be/src/exec/parquet-column-readers.h File be/src/exec/parquet-column-readers.h: PS2, Line 505: removing the virtual keyword breaks consistency with the other methods in this class. PS2, Line 546: 4 ? -- To view, visit http://gerrit.cloudera.org:8080/7776 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I7f6efddaea18507482940f5bdab7326b6482b067 Gerrit-PatchSet: 2 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Tianyi Wang <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-HasComments: Yes
