Csaba Ringhofer has posted comments on this change. ( http://gerrit.cloudera.org:8080/18327 )
Change subject: IMPALA-11123: Optimize count(star) for ORC scans
......................................................................

Patch Set 9: Code-Review+1

(2 comments)

Thanks for merging more of the ORC and Parquet code!

http://gerrit.cloudera.org:8080/#/c/18327/9/be/src/exec/hdfs-columnar-scanner.cc
File be/src/exec/hdfs-columnar-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/18327/9/be/src/exec/hdfs-columnar-scanner.cc@319
PS9, Line 319: COUNTER_ADD(scan_node_->rows_read_counter(), num_rows);
> We might want to remove this counter increment to avoid confusion.
I think we can remove this; IMO there is no real need to stay backward compatible in the profile.

To express that we have read the row group / stripe metadata, we could have counters like NumRowGroupsMetadataRead - seeing these new NumRowGroups counters would make it clear that reading the metadata was enough for those row groups.

http://gerrit.cloudera.org:8080/#/c/18327/9/tests/query_test/test_aggregation.py
File tests/query_test/test_aggregation.py:

http://gerrit.cloudera.org:8080/#/c/18327/9/tests/query_test/test_aggregation.py@260
PS9, Line 260: test_parquet_count_star_optimization
I agree with moving the count_star_optimization tests to a different class.

> Some are also slow, I suspect because they need to create unique_database
> first, only to be skipped later.
I never thought about this - I can imagine unique_database creation becoming a bottleneck in some cases during test runs, as it needs an extra insert+delete in the database behind HMS. Moving a bunch of pytest.skip() calls to @SkipIf could be a great ramp-up task.
-- 
To view, visit http://gerrit.cloudera.org:8080/18327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fafa1182f97323aeb9ee39dd4e8ecd418fa6091
Gerrit-Change-Number: 18327
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 Apr 2022 12:40:18 +0000
Gerrit-HasComments: Yes