Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/18327 )

Change subject: IMPALA-11123: Optimize count(star) for ORC scans
......................................................................


Patch Set 9: Code-Review+1

(2 comments)

Thanks for merging more of the ORC and Parquet code!

http://gerrit.cloudera.org:8080/#/c/18327/9/be/src/exec/hdfs-columnar-scanner.cc
File be/src/exec/hdfs-columnar-scanner.cc:

http://gerrit.cloudera.org:8080/#/c/18327/9/be/src/exec/hdfs-columnar-scanner.cc@319
PS9, Line 319:   COUNTER_ADD(scan_node_->rows_read_counter(), num_rows);
> We might want to remove this counter increment to avoid confusion.
I think we can remove this; IMO there is no real need to keep the profile 
backward compatible.

To express that we have read the row group / stripe metadata, we could add 
counters like NumRowGroupsMetadataRead. Seeing such a counter in the profile 
would make it clear that reading the metadata alone was enough for those row 
groups.


http://gerrit.cloudera.org:8080/#/c/18327/9/tests/query_test/test_aggregation.py
File tests/query_test/test_aggregation.py:

http://gerrit.cloudera.org:8080/#/c/18327/9/tests/query_test/test_aggregation.py@260
PS9, Line 260: test_parquet_count_star_optimization
I agree with moving count_star_optimization tests to a different class.

> Some are also slow, I suspect because they need to create unique_database 
> first, only to be skipped later.

I never thought about this. I can imagine unique_database creation becoming a 
bottleneck in some cases during test runs, as it needs an extra insert+delete 
in the database behind HMS. Moving a bunch of pytest.skip() calls to @SkipIf 
could be a great ramp-up task.



--
To view, visit http://gerrit.cloudera.org:8080/18327
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I0fafa1182f97323aeb9ee39dd4e8ecd418fa6091
Gerrit-Change-Number: 18327
Gerrit-PatchSet: 9
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Qifan Chen <qc...@cloudera.com>
Gerrit-Reviewer: Quanlong Huang <huangquanl...@gmail.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Comment-Date: Fri, 01 Apr 2022 12:40:18 +0000
Gerrit-HasComments: Yes
