Hello Quanlong Huang, David Rorke, Csaba Ringhofer, Impala Public Jenkins, 

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/19927

to look at the new patch set (#2).

Change subject: IMPALA-11123: Reimplement ORC optimized count star
......................................................................

IMPALA-11123: Reimplement ORC optimized count star

Commit 7ca20b3c94b1c9c1ddd4ed1e89f0969a0df55330 reverted the original
optimized count(star) for ORC scan from commit
f932d78ad0a30e322d59fc39072f710f889d2135 (gerrit review
http://gerrit.cloudera.org:8080/18327). The revert was necessary because
the unification of count star and zero slot functions into
HdfsColumnarScanner caused a significant regression for non-optimized
count star queries in parquet format (over 15% slower
MaterializeTupleTime).

This patch reimplements optimized count(star) for the ORC scan code path
while minimizing the code changes needed for the parquet scan code path.
After this patch, the ORC and parquet code paths will have only the
following new things in common:
- THdfsScanNode.count_star_slot_offset renamed to
  THdfsScanNode.star_slot_offset
- Static function HdfsScanner::IssueFooterRanges gains one function
  parameter, is_read_metadata_only. Parquet scan passes the
  ReadsParquetMetadataOnly function as the argument for
  is_read_metadata_only, while ORC scan passes the
  ReadsOrcMetadataOnly function. ReadsParquetMetadataOnly only checks
  HdfsScanNodeBase::IsZeroSlotTableScan(), so behavior remains
  unchanged for the parquet scan code path. ReadsOrcMetadataOnly, on
  the other hand, is more elaborate: it additionally checks
  HdfsScanNodeBase::optimize_count_star() and Hive ACID compaction
  status.
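
As a rough illustration of the is_read_metadata_only parameter described
above, here is a minimal sketch, not the actual patch. ScanNodeState,
FooterPlan, MetadataOnlyFn, IssueFooterRangesSketch and the
acid_state_permits flag are hypothetical stand-ins, and the way the ORC
predicate combines its checks is an assumption.

```cpp
#include <cassert>
#include <functional>

// Hypothetical stand-in for the scan node state consulted by the predicates.
struct ScanNodeState {
  bool is_zero_slot_table_scan = false;  // stands in for IsZeroSlotTableScan()
  bool optimize_count_star = false;      // stands in for optimize_count_star()
  bool acid_state_permits = true;        // placeholder for the ACID compaction check
};

using MetadataOnlyFn = std::function<bool(const ScanNodeState&)>;

// Parquet: metadata-only reads happen only for zero slot table scans.
bool ReadsParquetMetadataOnly(const ScanNodeState& node) {
  return node.is_zero_slot_table_scan;
}

// ORC: additionally considers count star optimization and ACID state.
// How the checks combine is an assumption of this sketch.
bool ReadsOrcMetadataOnly(const ScanNodeState& node) {
  if (!node.acid_state_permits) return false;
  return node.is_zero_slot_table_scan || node.optimize_count_star;
}

struct FooterPlan {
  int footer_only = 0;      // files whose footer alone answers the scan
  int footer_and_data = 0;  // files that also need column data
};

// Simplified stand-in for IssueFooterRanges: classifies each file using the
// format-specific predicate supplied by the caller.
FooterPlan IssueFooterRangesSketch(const ScanNodeState& node, int num_files,
                                   const MetadataOnlyFn& is_read_metadata_only) {
  assert(num_files >= 0);
  FooterPlan plan;
  for (int i = 0; i < num_files; ++i) {
    if (is_read_metadata_only(node)) {
      ++plan.footer_only;
    } else {
      ++plan.footer_and_data;
    }
  }
  return plan;
}
```

The shared function stays format-agnostic: each scanner supplies its own
predicate, so the parquet predicate can remain a plain zero slot check.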

The structure of HdfsParquetScanner::GetNextInternal() remains
unchanged. Its zero slot scan code path is still served through the
num_rows metadata from the parquet footer, while the optimized count
star code path still loops over row group metadata (also from the
parquet footer).
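
The difference between the two parquet paths can be sketched as follows.
This is a simplified illustration only: RowGroupMeta, FooterMeta,
ZeroSlotRowCount and CountStarPartials are hypothetical names, not the
parquet or Impala types, and producing one partial count per row group
is an assumption about what the loop yields.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical, simplified view of the footer metadata.
struct RowGroupMeta {
  int64_t num_rows;
};

struct FooterMeta {
  int64_t num_rows;                      // file-level row count
  std::vector<RowGroupMeta> row_groups;  // per-row-group metadata
};

// Zero slot scan: the file-level num_rows is all that is needed.
int64_t ZeroSlotRowCount(const FooterMeta& footer) {
  return footer.num_rows;
}

// Optimized count star: loop over row group metadata and collect one
// partial count per row group (assumed; a later step would sum them).
std::vector<int64_t> CountStarPartials(const FooterMeta& footer) {
  std::vector<int64_t> partials;
  partials.reserve(footer.row_groups.size());
  for (const RowGroupMeta& rg : footer.row_groups) {
    assert(rg.num_rows >= 0);
    partials.push_back(rg.num_rows);
  }
  return partials;
}
```

Neither function touches column data, which is why both paths can be
answered from the footer alone.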

HdfsOrcScanner adds an optimization_mode_ field of enum type
OptimizationMode, with three values: NONE, ZERO_SLOT_SCAN, and
OPTIMIZED_COUNT_STAR. It is initialized in HdfsOrcScanner::Open() by
considering multiple factors, including the table's Hive ACID compaction
status. HdfsOrcScanner::GetNextInternal() is reorganized to take one of
the three code paths.
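
A minimal sketch of the mode selection and three-way dispatch, under
stated assumptions: the enum values come from this commit message, but
OrcScanInputs, ChooseOptimizationMode, DescribeGetNextPath, the
acid_state_permits flag, the precedence between the checks, and the
path descriptions are all hypothetical.

```cpp
#include <cassert>
#include <string>

enum class OptimizationMode { NONE, ZERO_SLOT_SCAN, OPTIMIZED_COUNT_STAR };

// Hypothetical inputs considered when the scanner is opened.
struct OrcScanInputs {
  bool is_zero_slot_table_scan = false;
  bool optimize_count_star = false;
  bool acid_state_permits = true;  // placeholder for the ACID compaction check
};

// Picks the mode once, up front. The precedence shown is an assumption.
OptimizationMode ChooseOptimizationMode(const OrcScanInputs& in) {
  if (!in.acid_state_permits) return OptimizationMode::NONE;
  if (in.optimize_count_star) return OptimizationMode::OPTIMIZED_COUNT_STAR;
  if (in.is_zero_slot_table_scan) return OptimizationMode::ZERO_SLOT_SCAN;
  return OptimizationMode::NONE;
}

// Stand-in for the three-way branch in GetNextInternal(): returns a label
// for the path taken instead of doing real work.
std::string DescribeGetNextPath(OptimizationMode mode) {
  switch (mode) {
    case OptimizationMode::NONE:
      return "regular scan";
    case OptimizationMode::ZERO_SLOT_SCAN:
      return "zero slot scan";
    case OptimizationMode::OPTIMIZED_COUNT_STAR:
      return "optimized count star";
  }
  assert(false && "unhandled OptimizationMode");
  return "";
}
```

Deciding the mode once keeps the per-batch branch to a single switch on
an already-computed value.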

The following table shows single-node benchmark results for 3 count
query variants on TPC-DS scale 10, in both ORC and parquet format.

+-----------+---------------------------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+-------+
| Workload  | Query                     | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval  |
+-----------+---------------------------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+-------+
| TPCDS(10) | TPCDS-Q_COUNT_ZERO_SLOT   | parquet / none / none | 0.18   | 0.17        |   +4.92%   | * 25.40% * | * 17.73% *     | 9     |   +1.32%       | 1.05    | 0.46  |
| TPCDS(10) | TPCDS-Q_COUNT_ZERO_SLOT   | orc / def / block     | 0.16   | 0.15        |   +4.02%   | * 13.21% * | * 11.03% *     | 9     |   +1.20%       | 1.58    | 0.69  |
| TPCDS(10) | TPCDS-Q_COUNT_UNOPTIMIZED | orc / def / block     | 0.37   | 0.36        |   +0.33%   |   7.56%    |   5.94%        | 9     |   -0.21%       | -0.32   | 0.10  |
| TPCDS(10) | TPCDS-Q_COUNT_OPTIMIZED   | parquet / none / none | 0.22   | 0.23        |   -3.63%   | * 17.21% * | * 26.61% *     | 9     |   +1.81%       | 0.53    | -0.35 |
| TPCDS(10) | TPCDS-Q_COUNT_UNOPTIMIZED | parquet / none / none | 0.27   | 0.28        |   -2.28%   |   9.00%    | * 10.28% *     | 9     |   +0.13%       | 0.21    | -0.51 |
| TPCDS(10) | TPCDS-Q_COUNT_OPTIMIZED   | orc / def / block     | 0.16   | 0.20        | I -19.23%  | * 10.73% * | * 16.89% *     | 9     | I -32.40%      | -1.89   | -3.04 |
+-----------+---------------------------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+-------+

We also benchmarked this patch on parquet TPC-DS scale 3000 and measured
the MaterializeTupleTime. The table below shows the average and stddev
of MaterializeTupleTime in seconds, before and after the patch.

+---------------------------+----------+--------------+----------+--------------+
|          SqlName          | Base Avg | Base Stddev  |  New Avg |  New Stddev  |
+---------------------------+----------+--------------+----------+--------------+
| TPCDS-Q_COUNT_OPTIMIZED   | 0.000002 | 2.618618e-07 | 0.000002 | 3.003614e-07 |
| TPCDS-Q_COUNT_UNOPTIMIZED | 1.242631 | 3.420549e-02 | 1.257202 | 3.562676e-02 |
| TPCDS-Q_COUNT_ZERO_SLOT   | 0.027607 | 1.810048e-03 | 0.027012 | 1.531121e-03 |
+---------------------------+----------+--------------+----------+--------------+

Testing:
- Added 3 count query variants to the tpcds workload:
  TPCDS-Q_COUNT_OPTIMIZED, TPCDS-Q_COUNT_UNOPTIMIZED, and
  TPCDS-Q_COUNT_ZERO_SLOT.
- Restore PlannerTest.testOrcStatsAgg
- Restore TestAggregationQueriesRunOnce and
  TestAggregationQueriesRunOnce::test_orc_count_star_optimization
- Exercise count(star) in TestOrc::test_misaligned_orc_stripes
- Pass core tests

Change-Id: I5971c8f278e1dee44e2a8dd4d2f043d22ebf5d17
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/orc/hdfs-orc-scanner.cc
M be/src/exec/orc/hdfs-orc-scanner.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/orc-stats-agg.test
A testdata/workloads/functional-query/queries/QueryTest/orc-stats-agg.test
M testdata/workloads/functional-query/queries/QueryTest/partition-key-scans.test
M testdata/workloads/functional-query/queries/QueryTest/scanners.test
M tests/query_test/test_aggregation.py
M tests/query_test/test_scanners.py
M tests/util/test_file_parser.py
18 files changed, 905 insertions(+), 167 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/19927/2
-- 
To view, visit http://gerrit.cloudera.org:8080/19927
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I5971c8f278e1dee44e2a8dd4d2f043d22ebf5d17
Gerrit-Change-Number: 19927
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: David Rorke <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Quanlong Huang <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
