Riza Suminto has uploaded this change for review. ( http://gerrit.cloudera.org:8080/19927
Change subject: IMPALA-11123: Reimplement ORC optimized count star
......................................................................

IMPALA-11123: Reimplement ORC optimized count star

Commit 7ca20b3c94b1c9c1ddd4ed1e89f0969a0df55330 reverted the original
optimized count(star) for ORC scans from commit
f932d78ad0a30e322d59fc39072f710f889d2135 (gerrit review
http://gerrit.cloudera.org:8080/18327). The revert was necessary because
unifying the count star and zero slot functions into HdfsColumnarScanner
caused a significant regression for non-optimized count star queries in
parquet format (over 15% slower MaterializeTupleTime).

This patch reimplements optimized count(star) for the ORC scan code path
while minimizing the code changes needed in the parquet scan code path.
After this patch, the ORC and parquet code paths will have only the
following new things in common:
- THdfsScanNode.count_star_slot_offset renamed to
  THdfsScanNode.star_slot_offset
- New field is_footer_scanner_ added in HdfsColumnarScanner to record
  whether the scanner's scan range is a footer range.
- Static function HdfsScanner::IssueFooterRanges gains one function
  parameter, is_read_metadata_only.

The parquet scan passes the ReadsParquetMetadataOnly function as the
argument for the is_read_metadata_only parameter, while the ORC scan
passes the ReadsOrcMetadataOnly function. ReadsParquetMetadataOnly only
checks HdfsScanNodeBase::IsZeroSlotTableScan(), so behavior remains
unchanged for the parquet scan code path. ReadsOrcMetadataOnly is more
elaborate: it additionally checks HdfsScanNodeBase::optimize_count_star()
and the Hive ACID compaction status.

The structure of HdfsParquetScanner::GetNextInternal() remains unchanged.
Its zero slot scan code path is still served through num_rows metadata
from the parquet footer, while the optimized count star code path still
loops over row groups.

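The predicate-parameter arrangement described above can be pictured with
a minimal, self-contained sketch. It is not the patch's code:
ScanNodeSketch, its fields, MetadataOnlyFn, and
FooterRangeReadsMetadataOnly are hypothetical names invented for
illustration, and the exact way ReadsOrcMetadataOnly combines its three
checks (including how the ACID compaction status is represented) is an
assumption.

```cpp
#include <functional>

// Hypothetical stand-in for the HdfsScanNodeBase state the predicates consult.
struct ScanNodeSketch {
  bool zero_slot_table_scan = false;       // models HdfsScanNodeBase::IsZeroSlotTableScan()
  bool optimize_count_star = false;        // models HdfsScanNodeBase::optimize_count_star()
  bool acid_needs_row_validation = false;  // assumed summary of the Hive ACID compaction status
};

// Assumed shape of the is_read_metadata_only parameter.
using MetadataOnlyFn = std::function<bool(const ScanNodeSketch&)>;

// Parquet side: only the zero slot check, so existing behavior is preserved.
bool ReadsParquetMetadataOnly(const ScanNodeSketch& node) {
  return node.zero_slot_table_scan;
}

// ORC side: also considers optimized count star and ACID compaction status.
// The combination below is an assumption, not taken from the patch.
bool ReadsOrcMetadataOnly(const ScanNodeSketch& node) {
  if (node.acid_needs_row_validation) return false;
  return node.zero_slot_table_scan || node.optimize_count_star;
}

// Simplified stand-in for the decision made while issuing footer ranges:
// the caller supplies the format-specific predicate.
bool FooterRangeReadsMetadataOnly(const ScanNodeSketch& node,
                                  const MetadataOnlyFn& is_read_metadata_only) {
  return is_read_metadata_only(node);
}
```

Under these assumptions, a scan with only optimize_count_star set is
metadata-only for ORC but not for parquet, which matches the statement
that the parquet optimized count star path still loops over row groups.
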
HdfsOrcScanner adds an optimization_mode_ field, an enum of
OptimizationMode with three values: NONE, ZERO_SLOT_SCAN, and
OPTIMIZED_COUNT_STAR. It is initialized in HdfsOrcScanner::Open() by
considering multiple factors, including the table's Hive ACID compaction
status. HdfsOrcScanner::GetNextInternal() is reorganized to take one of
the three code paths. Unlike HdfsParquetScanner, both the optimized count
star and zero slot scan code paths of HdfsOrcScanner can be served
through ORC file metadata.

The following table shows single-node benchmark results for three count
query variants on TPC-DS scale 10, in both ORC and parquet format.

+-----------+---------------------------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+-------+
| Workload  | Query                     | File Format           | Avg(s) | Base Avg(s) | Delta(Avg) | StdDev(%)  | Base StdDev(%) | Iters | Median Diff(%) | MW Zval | Tval  |
+-----------+---------------------------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+-------+
| TPCDS(10) | TPCDS-Q_COUNT_ZERO_SLOT   | parquet / none / none | 0.18   | 0.17        | +4.92%     | * 25.40% * | * 17.73% *     | 9     | +1.32%         | 1.05    | 0.46  |
| TPCDS(10) | TPCDS-Q_COUNT_ZERO_SLOT   | orc / def / block     | 0.16   | 0.15        | +4.02%     | * 13.21% * | * 11.03% *     | 9     | +1.20%         | 1.58    | 0.69  |
| TPCDS(10) | TPCDS-Q_COUNT_UNOPTIMIZED | orc / def / block     | 0.37   | 0.36        | +0.33%     | 7.56%      | 5.94%          | 9     | -0.21%         | -0.32   | 0.10  |
| TPCDS(10) | TPCDS-Q_COUNT_OPTIMIZED   | parquet / none / none | 0.22   | 0.23        | -3.63%     | * 17.21% * | * 26.61% *     | 9     | +1.81%         | 0.53    | -0.35 |
| TPCDS(10) | TPCDS-Q_COUNT_UNOPTIMIZED | parquet / none / none | 0.27   | 0.28        | -2.28%     | 9.00%      | * 10.28% *     | 9     | +0.13%         | 0.21    | -0.51 |
| TPCDS(10) | TPCDS-Q_COUNT_OPTIMIZED   | orc / def / block     | 0.16   | 0.20        | I -19.23%  | * 10.73% * | * 16.89% *     | 9     | I -32.40%      | -1.89   | -3.04 |
+-----------+---------------------------+-----------------------+--------+-------------+------------+------------+----------------+-------+----------------+---------+-------+

We also benchmarked this patch on parquet TPC-DS scale 3000 and measured
MaterializeTupleTime. The table below shows the average and stddev of
MaterializeTupleTime in seconds, before and after the patch.

+---------------------------+----------+--------------+----------+--------------+
| SqlName                   | Base Avg | Base Stddev  | New Avg  | New Stddev   |
+---------------------------+----------+--------------+----------+--------------+
| TPCDS-Q_COUNT_OPTIMIZED   | 0.000002 | 2.618618e-07 | 0.000002 | 3.003614e-07 |
| TPCDS-Q_COUNT_UNOPTIMIZED | 1.242631 | 3.420549e-02 | 1.257202 | 3.562676e-02 |
| TPCDS-Q_COUNT_ZERO_SLOT   | 0.027607 | 1.810048e-03 | 0.027012 | 1.531121e-03 |
+---------------------------+----------+--------------+----------+--------------+

Testing:
- Add three count query variants to the tpcds workload:
  TPCDS-Q_COUNT_OPTIMIZED, TPCDS-Q_COUNT_UNOPTIMIZED, and
  TPCDS-Q_COUNT_ZERO_SLOT.
- Restore PlannerTest.testOrcStatsAgg
- Restore TestAggregationQueriesRunOnce and
  TestAggregationQueriesRunOnce::test_orc_count_star_optimization
- Exercise count(star) in TestOrc::test_misaligned_orc_stripes
- Pass core tests

Change-Id: I5971c8f278e1dee44e2a8dd4d2f043d22ebf5d17
---
M be/src/exec/hdfs-columnar-scanner.cc
M be/src/exec/hdfs-columnar-scanner.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/hdfs-scanner.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/orc/hdfs-orc-scanner.cc
M be/src/exec/orc/hdfs-orc-scanner.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
A testdata/workloads/functional-planner/queries/PlannerTest/orc-stats-agg.test
A testdata/workloads/functional-query/queries/QueryTest/orc-stats-agg.test
M testdata/workloads/functional-query/queries/QueryTest/partition-key-scans.test
M testdata/workloads/functional-query/queries/QueryTest/scanners.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q_count_optimized.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q_count_unoptimized.test
A testdata/workloads/tpcds/queries/tpcds-decimal_v2-q_count_zero_slot.test
M tests/query_test/test_aggregation.py
M tests/query_test/test_scanners.py
M tests/util/test_file_parser.py
23 files changed, 931 insertions(+), 153 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/27/19927/1
--
To view, visit http://gerrit.cloudera.org:8080/19927
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newchange
Gerrit-Change-Id: I5971c8f278e1dee44e2a8dd4d2f043d22ebf5d17
Gerrit-Change-Number: 19927
Gerrit-PatchSet: 1
Gerrit-Owner: Riza Suminto <[email protected]>
