Hello Tamas Mate, Qifan Chen, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/17960
to look at the new patch set (#3).
Change subject: IMPALA-10777: Enable min/max filtering for Iceberg partitions
......................................................................
IMPALA-10777: Enable min/max filtering for Iceberg partitions
This patch enables min/max filters for Iceberg columns that
participate in table partitioning. The min/max filters are
evaluated at the Parquet row group level, so the scanner still has
to open each file and read its metadata. This makes the approach
slower than dynamic partition pruning (which doesn't need to open
the files at all), but much faster than no pruning.
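To illustrate what row-group-level evaluation means, here is a minimal
sketch in Python. It is not Impala's implementation (the real logic
lives in the C++ Parquet scanner), and the names MinMaxFilter,
RowGroupStats and row_groups_to_scan are hypothetical. It only assumes
that each row group carries min/max statistics for the filtered column
and that the filter carries the min/max of the join keys.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class MinMaxFilter:
    """Range of join-key values seen on the build side (hypothetical type)."""
    min_val: int
    max_val: int


@dataclass
class RowGroupStats:
    """Min/max statistics of one column in one row group (hypothetical type)."""
    name: str
    col_min: int
    col_max: int


def overlaps(stats: RowGroupStats, flt: MinMaxFilter) -> bool:
    # Two closed ranges overlap unless one ends before the other starts.
    return stats.col_min <= flt.max_val and stats.col_max >= flt.min_val


def row_groups_to_scan(row_groups: List[RowGroupStats],
                       flt: MinMaxFilter) -> List[str]:
    # Row groups whose value range cannot contain a matching key are skipped.
    return [rg.name for rg in row_groups if overlaps(rg, flt)]


if __name__ == "__main__":
    groups = [
        RowGroupStats("rg0", 0, 99),
        RowGroupStats("rg1", 100, 199),
        RowGroupStats("rg2", 120, 125),
        RowGroupStats("rg3", 131, 300),
    ]
    # Only rg1 and rg2 overlap the filter range [100, 130].
    print(row_groups_to_scan(groups, MinMaxFilter(100, 130)))
```

In this sketch the row group statistics have to be read before a row
group can be skipped, which is why the file must still be opened.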
Performance:
I used the following query to measure performance on a scale
factor 10 TPC-DS dataset:
  select i_item_id, sum(ss_ext_sales_price) total_sales
  from
    store_sales,
    date_dim,
    customer_address,
    item
  where i_item_id in (select
        i_item_id
      from item
      where i_color in ('orchid','chiffon','lace'))
    and ss_item_sk = i_item_sk
    and ss_sold_date_sk = d_date_sk
    and d_year = 2000
    and d_moy = 1
    and ss_addr_sk = ca_address_sk
    and ca_gmt_offset = -8
The above query took the following times to execute:
Regular Parquet table: 1.16s
Iceberg table without min/max filters: 4.39s
Iceberg table with min/max filters: 1.77s
Testing:
* added e2e test
* a planner test could not be added because Iceberg tables behave
  differently during planner tests (due to some hacks that need
  refactoring)
Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac
---
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/runtime/runtime-filter.h
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/catalog/FeIcebergTable.java
M fe/src/main/java/org/apache/impala/catalog/FeTable.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M testdata/workloads/functional-query/queries/QueryTest/min_max_filters.test
8 files changed, 80 insertions(+), 10 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/60/17960/3
--
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I51b53188c6da7eeebfeae385e1de31ace0980cac
Gerrit-Change-Number: 17960
Gerrit-PatchSet: 3
Gerrit-Owner: Zoltan Borok-Nagy <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Tamas Mate <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>