Zoltan Borok-Nagy has submitted this change and it was merged. ( 
http://gerrit.cloudera.org:8080/19494 )

Change subject: IMPALA-11802: Optimize count(*) queries for Iceberg V2 position 
delete tables
......................................................................

IMPALA-11802: Optimize count(*) queries for Iceberg V2 position delete tables

The SCAN plan of count star query for Iceberg V2 position delete tables
as follows:

    AGGREGATE
    COUNT(*)
        |
    UNION ALL
   /         \
  /           \
 /             \
SCAN all    ANTI JOIN
datafiles  /         \
without   /           \
deletes  SCAN         SCAN
         datafiles    deletes

Since Iceberg provides the number of records in a file(record_count), we
can use this to optimize a simple count star query for Iceberg V2
position delete tables. Firstly, the number of records of all DataFiles
without corresponding DeleteFiles can be calculated by Iceberg meta
files. And then rewrite the query as follows:

      ArithmeticExpr(ADD)
      /             \
     /               \
    /                 \
record_count       AGGREGATE
of all             COUNT(*)
datafiles              |
without            ANTI JOIN
deletes           /         \
                 /           \
                SCAN        SCAN
                datafiles   deletes

Testing:
 * Existing tests
 * Added e2e tests

Change-Id: I8172c805121bf91d23fe063f806493afe2f03d41
Reviewed-on: http://gerrit.cloudera.org:8080/19494
Tested-by: Impala Public Jenkins <[email protected]>
Reviewed-by: Zoltan Borok-Nagy <[email protected]>
---
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/Expr.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/planner/IcebergScanPlanner.java
M fe/src/main/java/org/apache/impala/rewrite/CountStarToConstRule.java
M 
testdata/workloads/functional-planner/queries/PlannerTest/iceberg-v2-tables.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-plain-count-star-optimization.test
A 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-plain-count-star-optimization.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes-orc.test
M 
testdata/workloads/functional-query/queries/QueryTest/iceberg-v2-read-position-deletes.test
M tests/query_test/test_iceberg.py
13 files changed, 485 insertions(+), 51 deletions(-)

Approvals:
  Impala Public Jenkins: Verified
  Zoltan Borok-Nagy: Looks good to me, approved

--
To view, visit http://gerrit.cloudera.org:8080/19494
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: merged
Gerrit-Change-Id: I8172c805121bf91d23fe063f806493afe2f03d41
Gerrit-Change-Number: 19494
Gerrit-PatchSet: 7
Gerrit-Owner: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Anonymous Coward <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>
Gerrit-Reviewer: Xiaoqing Gao <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>

Reply via email to