lipeng...@sensorsdata.cn has uploaded a new patch set (#12). ( http://gerrit.cloudera.org:8080/18574 )
Change subject: IMPALA-11279: Optimize plain count(*) queries for Iceberg tables
......................................................................

IMPALA-11279: Optimize plain count(*) queries for Iceberg tables

This commit optimizes the plain count(*) queries for the Iceberg tables.
When the `org.apache.iceberg.SnapshotSummary#TOTAL_RECORDS_PROP` can be
retrieved from the current `org.apache.iceberg.BaseSnapshot#summary` of
the Iceberg table, this kind of query can be very fast. If this property
is not retrieved, the query will aggregate the `num_rows` of parquet
`file_metadata_` as usual.

Queries that can be optimized need to meet the following requirements:
 - SelectStmt does not have WHERE clause
 - SelectStmt does not have GROUP BY clause
 - SelectStmt does not have HAVING clause
 - The TableRefs of FROM clause contains only one BaseTableRef
 - Only for the Iceberg table
 - SelectList contains only 'count(*)' or 'count(constant)'

Testing:
 - Added end-to-end test
 - Existing tests
 - Test it in a real cluster

Change-Id: I8e9c48bbba7ab2320fa80915e7001ce54f1ef6d9
---
M fe/src/main/java/org/apache/impala/analysis/AnalysisContext.java
M fe/src/main/java/org/apache/impala/analysis/Analyzer.java
M fe/src/main/java/org/apache/impala/analysis/FunctionCallExpr.java
M fe/src/main/java/org/apache/impala/analysis/SelectStmt.java
M fe/src/main/java/org/apache/impala/catalog/FeFsTable.java
A fe/src/main/java/org/apache/impala/rewrite/CountStarToConstRule.java
M testdata/workloads/functional-query/queries/QueryTest/iceberg-compound-predicate-push-down.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-in-predicate-push-down.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-is-null-predicate-push-down.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-partitioned-insert.test
A testdata/workloads/functional-query/queries/QueryTest/iceberg-plain-count-star-optimization.test
M testdata/workloads/functional-query/queries/QueryTest/iceberg-upper-lower-bound-metrics.test
M tests/query_test/test_iceberg.py
13 files changed, 415 insertions(+), 16 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/74/18574/12
--
To view, visit http://gerrit.cloudera.org:8080/18574
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I8e9c48bbba7ab2320fa80915e7001ce54f1ef6d9
Gerrit-Change-Number: 18574
Gerrit-PatchSet: 12
Gerrit-Owner: Anonymous Coward <lipeng...@sensorsdata.cn>
Gerrit-Reviewer: Anonymous Coward <lipeng...@sensorsdata.cn>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Gergely Fürnstáhl <gfurnst...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Jian Zhang <zjsar...@gmail.com>
Gerrit-Reviewer: Tamas Mate <tma...@apache.org>
Gerrit-Reviewer: Xianqing He <hexianqing...@126.com>
Gerrit-Reviewer: Zoltan Borok-Nagy <borokna...@cloudera.com>
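
For readers skimming this notification, the eligibility rules and the fallback described in the commit message can be sketched roughly as follows. This is not the code in the patch: `PlainCountStarSketch`, `QueryShape`, `isEligible` and `countStar` are hypothetical names invented for illustration, the query is reduced to a handful of flags instead of a real `SelectStmt`, and the fallback simply sums per-file row counts passed in as a list. The only detail taken from Iceberg itself is the key `"total-records"`, which is the value of `SnapshotSummary.TOTAL_RECORDS_PROP`.

```java
import java.util.List;
import java.util.Map;
import java.util.OptionalLong;

public class PlainCountStarSketch {
  // Value of org.apache.iceberg.SnapshotSummary#TOTAL_RECORDS_PROP.
  static final String TOTAL_RECORDS_PROP = "total-records";

  /** Hypothetical, heavily simplified stand-in for an analyzed SelectStmt. */
  static final class QueryShape {
    final boolean hasWhere, hasGroupBy, hasHaving;
    final int numBaseTableRefs;        // BaseTableRefs in the FROM clause
    final boolean isIcebergTable;
    final boolean selectListOnlyCount; // only count(*) or count(constant)

    QueryShape(boolean hasWhere, boolean hasGroupBy, boolean hasHaving,
        int numBaseTableRefs, boolean isIcebergTable, boolean selectListOnlyCount) {
      this.hasWhere = hasWhere;
      this.hasGroupBy = hasGroupBy;
      this.hasHaving = hasHaving;
      this.numBaseTableRefs = numBaseTableRefs;
      this.isIcebergTable = isIcebergTable;
      this.selectListOnlyCount = selectListOnlyCount;
    }
  }

  /** Mirrors the requirement list in the commit message, one check per bullet. */
  static boolean isEligible(QueryShape q) {
    return !q.hasWhere && !q.hasGroupBy && !q.hasHaving
        && q.numBaseTableRefs == 1 && q.isIcebergTable && q.selectListOnlyCount;
  }

  /** Reads the total record count from a snapshot summary map, if present and numeric. */
  static OptionalLong totalRecords(Map<String, String> snapshotSummary) {
    String value = snapshotSummary.get(TOTAL_RECORDS_PROP);
    if (value == null) return OptionalLong.empty();
    try {
      return OptionalLong.of(Long.parseLong(value));
    } catch (NumberFormatException e) {
      return OptionalLong.empty();
    }
  }

  /**
   * Uses the summary value when the query is eligible and the property exists;
   * otherwise falls back to summing per-file row counts (a stand-in for the
   * usual aggregation over Parquet file metadata).
   */
  static long countStar(QueryShape q, Map<String, String> snapshotSummary,
      List<Long> perFileNumRows) {
    if (isEligible(q)) {
      OptionalLong total = totalRecords(snapshotSummary);
      if (total.isPresent()) return total.getAsLong();
    }
    long sum = 0;
    for (long n : perFileNumRows) sum += n;
    return sum;
  }

  public static void main(String[] args) {
    QueryShape plain = new QueryShape(false, false, false, 1, true, true);
    QueryShape filtered = new QueryShape(true, false, false, 1, true, true);
    List<Long> files = List.of(10L, 20L);
    // Eligible query with the property present: answered from the summary.
    System.out.println(countStar(plain, Map.of(TOTAL_RECORDS_PROP, "30"), files));
    // Query with a WHERE clause: falls back to summing per-file counts.
    System.out.println(countStar(filtered, Map.of(TOTAL_RECORDS_PROP, "30"), files));
  }
}
```

The sketch keeps the eligibility check separate from the summary lookup so that a missing or malformed property degrades to the ordinary aggregation path, which matches the "as usual" fallback the commit message describes.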