Qifan Chen has uploaded a new patch set (#12). ( 
http://gerrit.cloudera.org:8080/16720 )

Change subject: IMPALA-10325 Parquet scan should use min/max statistics to skip 
pages based on equi-join predicate
......................................................................

IMPALA-10325 Parquet scan should use min/max statistics to skip pages based on 
equi-join predicate

This patch adds the logic to utilize min/max stats for Parquet row
groups or pages to skip these entities when they don't qualify an
equi-join predicate.

A new class of predicates called overlap predicates is introduced to aid
in the determination of whether a Parquet row group or a page overlap
with the a range computed from the hash join. If not, then the entire
Parquet row group or the page are skipped. The new class of predicates
co-exist with the existing min/max conjuncts that are introduced based
on the local scan predicates.

Both classes of predicates can work individually or togther with each
other. The overlap predicates are evaualted after the existing min/max
conjuncts.

To be done:
1. Handle all data types;
2. Unit/performance testing;
3. Core testing.

Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691
---
M be/src/exec/exec-node.h
M be/src/exec/hdfs-scan-node-base.cc
M be/src/exec/hdfs-scan-node-base.h
M be/src/exec/parquet/hdfs-parquet-scanner.cc
M be/src/exec/parquet/hdfs-parquet-scanner.h
M be/src/exec/parquet/parquet-column-stats.cc
M be/src/exec/parquet/parquet-column-stats.h
M be/src/exec/partitioned-hash-join-builder.cc
M be/src/exec/scan-node.cc
M be/src/runtime/coordinator.cc
M common/thrift/PlanNodes.thrift
M fe/src/main/java/org/apache/impala/analysis/TupleDescriptor.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
14 files changed, 386 insertions(+), 20 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/20/16720/12
--
To view, visit http://gerrit.cloudera.org:8080/16720
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I379405ee75b14929df7d6b5d20dabc6f51375691
Gerrit-Change-Number: 16720
Gerrit-PatchSet: 12
Gerrit-Owner: Qifan Chen <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Tim Armstrong <[email protected]>
Gerrit-Reviewer: Zoltan Borok-Nagy <[email protected]>

Reply via email to