Zach Amsden has posted comments on this change. Change subject: IMPALA-5036: Parquet count star optimization ......................................................................
Patch Set 3: (3 comments) http://gerrit.cloudera.org:8080/#/c/6812/3/be/src/exec/hdfs-parquet-scanner.cc File be/src/exec/hdfs-parquet-scanner.cc: Line 445: *dst_slot = file_metadata_.row_groups[row_group_idx_].num_rows; Bounds check against file_metadata_.num_rows (i.e. keep a running counter as below in row_group_rows_read_ and DCHECK_LE(row_group_rows_read_, file_metatdata_.num_rows)? Line 454: if (scan_node_->IsZeroSlotTableScan()) { Why is this optimization not redundant now? Maybe update the comment to indicate the type of query this now will operate on: e.g., select 1 from table http://gerrit.cloudera.org:8080/#/c/6812/3/common/thrift/PlanNodes.thrift File common/thrift/PlanNodes.thrift: Line 226: 11: optional i64 parquet_count_star_slot_offset > i32 right? Would it be simpler to have this be one parameter and indicate truth by passing a value >= 0? -- To view, visit http://gerrit.cloudera.org:8080/6812 To unsubscribe, visit http://gerrit.cloudera.org:8080/settings Gerrit-MessageType: comment Gerrit-Change-Id: I536b85c014821296aed68a0c68faadae96005e62 Gerrit-PatchSet: 3 Gerrit-Project: Impala-ASF Gerrit-Branch: master Gerrit-Owner: Taras Bobrovytsky <[email protected]> Gerrit-Reviewer: Alex Behm <[email protected]> Gerrit-Reviewer: Lars Volker <[email protected]> Gerrit-Reviewer: Marcel Kornacker <[email protected]> Gerrit-Reviewer: Mostafa Mokhtar <[email protected]> Gerrit-Reviewer: Taras Bobrovytsky <[email protected]> Gerrit-Reviewer: Zach Amsden <[email protected]> Gerrit-HasComments: Yes
