Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18327 )
Change subject: IMPALA-11123: Optimize count(star) for ORC scans
......................................................................


Patch Set 3:

(3 comments)

http://gerrit.cloudera.org:8080/#/c/18327/3/be/src/exec/hdfs-orc-scanner.cc
File be/src/exec/hdfs-orc-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/18327/3/be/src/exec/hdfs-orc-scanner.cc@780
PS3, Line 780:   int64_t num_rows = static_cast<int64_t>(reader_->getNumberOfRows());
Can you unify this with the Parquet implementation? If I didn't miss something then the only difference is in getting the number of rows - there could be a virtual function like HdfsColumnarScanner::GetNumberOfRowsInFile()


http://gerrit.cloudera.org:8080/#/c/18327/3/be/src/exec/hdfs-scanner.cc
File be/src/exec/hdfs-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/18327/3/be/src/exec/hdfs-scanner.cc@846
PS3, Line 846:           !(scan_node->IsZeroSlotTableScan() || scan_node->optimize_count_star())
Having a function like "readsOnlyMetadata()" could be more descriptive.


http://gerrit.cloudera.org:8080/#/c/18327/3/be/src/exec/hdfs-scanner.cc@847
PS3, Line 847:           || footer_split == split) {
Shouldn't we also check row_batches_need_validation_ like in HdfsOrcScanner::GetNextInternal?



-- 
To view, visit http://gerrit.cloudera.org:8080/18327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fafa1182f97323aeb9ee39dd4e8ecd418fa6091
Gerrit-Change-Number: 18327
Gerrit-PatchSet: 3
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Comment-Date: Wed, 23 Mar 2022 15:01:07 +0000
Gerrit-HasComments: Yes