Hello Daniel Becker, Abhishek Rawat, David Rorke, Csaba Ringhofer, Impala 
Public Jenkins,

I'd like you to reexamine a change. Please visit

    http://gerrit.cloudera.org:8080/20498

to look at the new patch set (#2).

Change subject: IMPALA-12018: Consider runtime filter for cardinality reduction
......................................................................

IMPALA-12018: Consider runtime filter for cardinality reduction

Currently Impala creates a plan first and looks for runtime filters
based on the complete plan. This means the cardinality estimates in the
query plan do not incorporate runtime filter selectivity. The actual scan
cardinality at runtime is often much lower than the cardinality estimate
because of runtime filters.

This patch applies runtime filter selectivity to lower cardinality
estimates of scan nodes and certain join nodes above them after runtime
filter generation and before resource requirement computation. The
algorithm selects a contiguous probe pipeline consisting of a scan node,
exchanges, and reducing join nodes. Depending on whether the join node
produces a runtime filter and the type of that runtime filter, it then
applies the runtime filter selectivity to the scan node to reduce its
cardinality and input cardinality estimate.
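
As a rough illustration of the idea above (not the code in this patch),
the following minimal Java sketch walks a probe pipeline upward from a
scan node, skips exchanges, and stops at the first join that is not
reducing. The Node class, its fields, and the selectivity formula
(join output rows divided by probe input rows) are all assumptions made
for this example; they are not Impala's PlanNode API or the patch's
actual estimate.

```java
import java.util.List;

// Hypothetical model of a probe pipeline; these are NOT Impala classes.
class RuntimeFilterCardinalitySketch {
  enum Kind { SCAN, EXCHANGE, JOIN }

  static final class Node {
    final Kind kind;
    final long inputCardinality;   // probe-side rows entering the node
    final long outputCardinality;  // rows the node is estimated to emit
    final boolean producesFilter;  // join sends a runtime filter to the scan

    Node(Kind kind, long in, long out, boolean producesFilter) {
      this.kind = kind;
      this.inputCardinality = in;
      this.outputCardinality = out;
      this.producesFilter = producesFilter;
    }
  }

  static Node scan(long card) { return new Node(Kind.SCAN, card, card, false); }
  static Node exchange(long card) { return new Node(Kind.EXCHANGE, card, card, false); }
  static Node join(long in, long out, boolean producesFilter) {
    return new Node(Kind.JOIN, in, out, producesFilter);
  }

  // pipeline.get(0) must be the scan; later entries are its ancestors,
  // bottom-up. Cardinalities are assumed to be known and non-negative.
  static long reducedScanCardinality(List<Node> pipeline) {
    Node scanNode = pipeline.get(0);
    if (scanNode.kind != Kind.SCAN) {
      throw new IllegalArgumentException("pipeline must start at a scan node");
    }
    double selectivity = 1.0;
    for (int i = 1; i < pipeline.size(); i++) {
      Node n = pipeline.get(i);
      if (n.kind == Kind.EXCHANGE) continue;  // exchanges do not change row counts
      boolean reducing =
          n.kind == Kind.JOIN && n.outputCardinality < n.inputCardinality;
      if (!reducing) break;                   // the contiguous pipeline ends here
      if (n.producesFilter) {
        // Assumed selectivity: fraction of probe rows that survive this join.
        selectivity *= (double) n.outputCardinality / n.inputCardinality;
      }
    }
    // Never estimate fewer than one row.
    return Math.max(1L, Math.round(scanNode.outputCardinality * selectivity));
  }
}
```

In this sketch a reducing join that produces no runtime filter stays in
the pipeline but leaves the scan estimate unchanged, while a
non-reducing join ends the pipeline.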

This cardinality reduction is currently applied only in cost-based
planning mode (COMPUTE_PROCESSING_COST option is True) to avoid
potential regressions in regular planning mode. Cost-based planning mode
can benefit the most from reduced scan cardinality: it can lead to lower
ProcessingCost, lower scan fragment parallelism, and a higher chance of
the query being assigned to a smaller executor group set. We can
consider enabling this cardinality reduction technique for all planning
modes after a more thorough performance evaluation (which requires more
planner test changes).

Testing:
- Pass test_executor_groups.py.
- Pass PlannerTest#testProcessingCost.

Change-Id: I033789c9b63a8188484e3afde8e646563918b3e1
---
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/Planner.java
M fe/src/main/java/org/apache/impala/planner/RuntimeFilterGenerator.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds-processing-cost.test
5 files changed, 465 insertions(+), 253 deletions(-)


  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/98/20498/2
--
To view, visit http://gerrit.cloudera.org:8080/20498
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: I033789c9b63a8188484e3afde8e646563918b3e1
Gerrit-Change-Number: 20498
Gerrit-PatchSet: 2
Gerrit-Owner: Riza Suminto <riza.sumi...@cloudera.com>
Gerrit-Reviewer: Abhishek Rawat <ara...@cloudera.com>
Gerrit-Reviewer: Csaba Ringhofer <csringho...@cloudera.com>
Gerrit-Reviewer: Daniel Becker <daniel.bec...@cloudera.com>
Gerrit-Reviewer: David Rorke <dro...@cloudera.com>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>
Gerrit-Reviewer: Riza Suminto <riza.sumi...@cloudera.com>
