Qifan Chen has uploaded a new patch set (#10). (
http://gerrit.cloudera.org:8080/17568 )
Change subject: IMPALA-10738: Min/max filters should be enabled for partition
columns
......................................................................
IMPALA-10738: Min/max filters should be enabled for partition columns
This patch enables min/max filters for partition columns by default,
taking advantage of the existing min/max filter infrastructure.
To turn off the feature, set the new query option
minmax_filter_partition_column to false.
In this patch, the existing query option enabled_runtime_filter_types
determines which types of filters are generated. The default value ALL
generates both bloom and min/max filters, BLOOM generates only bloom
filters, and MIN_MAX generates only min/max filters.
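The mapping from option value to generated filter types can be sketched
as follows. This is an illustrative model only, not code from the patch:
the function name generated_filter_types and the set-of-strings
representation are assumptions made for the example.

```python
# Illustrative sketch only; not Impala source code.
# generated_filter_types is a hypothetical helper that models the
# documented meaning of the enabled_runtime_filter_types query option.

def generated_filter_types(enabled_runtime_filter_types="ALL"):
    """Return the set of runtime filter types produced for an option value."""
    mapping = {
        "ALL": {"BLOOM", "MIN_MAX"},   # default: both kinds of filters
        "BLOOM": {"BLOOM"},            # bloom filters only
        "MIN_MAX": {"MIN_MAX"},        # min/max filters only
    }
    if enabled_runtime_filter_types not in mapping:
        raise ValueError(
            "unsupported value: %r" % (enabled_runtime_filter_types,))
    return mapping[enabled_runtime_filter_types]


if __name__ == "__main__":
    for value in ("ALL", "BLOOM", "MIN_MAX"):
        print(value, sorted(generated_filter_types(value)))
```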
The usual control knobs minmax_filter_threshold (for the threshold) and
minmax_filtering_level (for the filtering level) still apply. When the
threshold is 0, the patch automatically assigns a reasonable value
to it.
Testing:
1). Added new tests in
overlap_min_max_filters_on_partition_columns.test;
2). Core tests [TBD]
Change-Id: I89e135ef48b4bb36d70075287b03d1c12496b042
---
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scanner.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/runtime/runtime-filter.h
M be/src/service/query-options.cc
M be/src/service/query-options.h
M common/thrift/ImpalaService.thrift
M common/thrift/Query.thrift
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M fe/src/test/java/org/apache/impala/planner/TpcdsPlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test
M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters-hdfs-num-rows-est-enabled.test
M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-query-options.test
A testdata/workloads/functional-query/queries/QueryTest/overlap_min_max_filters_on_partition_columns.test
M tests/query_test/test_runtime_filters.py
20 files changed, 361 insertions(+), 218 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/68/17568/10
--
To view, visit http://gerrit.cloudera.org:8080/17568
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I89e135ef48b4bb36d70075287b03d1c12496b042
Gerrit-Change-Number: 17568
Gerrit-PatchSet: 10
Gerrit-Owner: Qifan Chen <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>