Hello Quanlong Huang, Csaba Ringhofer, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/18327
to look at the new patch set (#2).
Change subject: IMPALA-11123: Optimize count(star) for ORC scans
......................................................................
IMPALA-11123: Optimize count(star) for ORC scans
IMPALA-5036 added an optimization for count(star) in Parquet scans that
avoids materializing dummy rows. This change provides a similar
optimization for ORC tables. We use each stripe's number-of-rows statistic
to compute count(star) instead of materializing empty rows. The
aggregate function is changed from a count to a special sum function
initialized to 0.
This count(star) optimization is disabled for ACID tables
because the scanner might need to read and validate the
'currentTransaction' column. This patch also drops 'parquet' from names
related to the count star optimization.
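
For reviewers who want the idea in miniature, here is a rough sketch of the
approach described above. It is not the code in this patch: Stripe, OrcFile,
file_row_count, sum_init_zero and count_star_from_stats are hypothetical names
used only for illustration, and returning None to signal "fall back to a
regular scan" is an assumption about how a caller might handle ACID tables.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Stripe:
    """Hypothetical stand-in for the row count kept in an ORC stripe's metadata."""
    num_rows: int


@dataclass
class OrcFile:
    """Hypothetical stand-in for one ORC file belonging to a table."""
    stripes: List[Stripe]


def file_row_count(orc_file: OrcFile) -> int:
    """Scanner side: derive one partial count per file from stripe metadata,
    without decoding any column data or materializing empty rows."""
    return sum(stripe.num_rows for stripe in orc_file.stripes)


def sum_init_zero(partials: List[int]) -> int:
    """Aggregation side: a sum that starts at 0, so an empty input still
    yields 0 (like count(star)) instead of NULL (like a plain SQL sum)."""
    total = 0
    for partial in partials:
        total += partial
    return total


def count_star_from_stats(files: List[OrcFile], is_acid: bool) -> Optional[int]:
    """Return count(star) computed from stripe statistics, or None when the
    shortcut does not apply and rows must be scanned normally (assumed
    convention for this sketch)."""
    if is_acid:
        # ACID tables are excluded: the scanner may need to read and
        # validate the 'currentTransaction' column for each row.
        return None
    return sum_init_zero([file_row_count(f) for f in files])
```

With two files holding stripes of 100 and 50 rows and of 7 rows, the sketch
returns 157; with no files it returns 0, which is why the replacement sum has
to be initialized to 0 rather than behave like an ordinary sum.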
Testing:
- Add PlannerTest.testOrcStatsAgg
- Add TestAggregationQueries::test_orc_count_star_optimization
- Pass core tests
Change-Id: I0fafa1182f97323aeb9ee39dd4e8ecd418fa6091
---
M be/src/exec/hdfs-orc-scanner.cc
M be/src/exec/hdfs-orc-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/orc-stats-agg.test
A testdata/workloads/functional-query/queries/QueryTest/orc-stats-agg.test
M tests/query_test/test_aggregation.py
M tests/query_test/test_scanners.py
12 files changed, 626 insertions(+), 47 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/18327/2
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I0fafa1182f97323aeb9ee39dd4e8ecd418fa6091
Gerrit-Change-Number: 18327
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>