Hello Michael Smith, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/22046
to look at the new patch set (#5).
Change subject: IMPALA-13526: Fix Agg node creation order in DistributedPlanner
......................................................................
IMPALA-13526: Fix Agg node creation order in DistributedPlanner
Within DistributedPlanner.java, there are several places where the
planner needs to insert an extra merge aggregation node. Doing so
requires transferring HAVING conjuncts from the preaggregation node to
the merge aggregation node, unsetting the limit, and recomputing the
stats of the preaggregation node. However, the stats recomputation is
not done consistently, and in some cases it may be done inefficiently.
This patch fixes the AggregationNode creation order in
DistributedPlanner.java so that stats recomputation is done consistently
and efficiently.
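As a rough illustration of the ordering described above (not the actual
DistributedPlanner code), the sketch below uses hypothetical
SketchAggNode and MergeAggSketch classes with a toy selectivity model.
It also assumes that the limit moves to the merge node, which this
message does not state explicitly.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a plan node; NOT Impala's AggregationNode.
class SketchAggNode {
  final List<String> conjuncts = new ArrayList<>();
  long limit = -1;            // -1 means "no limit"
  final long inputCardinality;
  long cardinality = -1;      // -1 means "not computed yet"
  int statsComputations = 0;  // counts how often stats were computed

  SketchAggNode(long inputCardinality) {
    this.inputCardinality = inputCardinality;
  }

  // Toy estimate (assumption): each conjunct keeps 10% of the rows,
  // and a limit, if set, caps the result.
  void computeStats() {
    statsComputations++;
    long est = inputCardinality;
    for (int i = 0; i < conjuncts.size(); i++) est = Math.max(1, est / 10);
    cardinality = (limit >= 0) ? Math.min(est, limit) : est;
  }
}

class MergeAggSketch {
  // Inserts a merge aggregation on top of preAgg in a fixed order.
  static SketchAggNode addMergeAgg(SketchAggNode preAgg) {
    // 1. Take the HAVING conjuncts and the limit off the preaggregation.
    List<String> having = new ArrayList<>(preAgg.conjuncts);
    long limit = preAgg.limit;
    preAgg.conjuncts.clear();
    preAgg.limit = -1;

    // 2. Recompute the preaggregation stats exactly once, after both changes.
    preAgg.computeStats();

    // 3. Only then create the merge node, so it sees up-to-date child stats.
    SketchAggNode merge = new SketchAggNode(preAgg.cardinality);
    merge.conjuncts.addAll(having);
    merge.limit = limit;  // assumption: the limit is applied at the merge node
    merge.computeStats();
    return merge;
  }
}
```

Creating the merge node last means its estimate is derived from the
preaggregation's refreshed cardinality, and the preaggregation's stats
are computed a single time per insertion.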
This patch also fixes a bug where the cardinality estimate of the MERGE
phase aggregation is not capped against the output cardinality of the
EXCHANGE node.
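A minimal sketch of such a cap, assuming -1 denotes an unknown
cardinality and that an unknown value on either side leaves the estimate
untouched. CardinalityCapSketch and capMergeCardinality are hypothetical
names, not methods from PlanNode.java or AggregationNode.java.

```java
// Hypothetical helper; NOT part of Impala's planner API.
class CardinalityCapSketch {
  // Returns the merge estimate capped by the exchange output cardinality.
  // A negative value means "unknown"; in that case no cap is applied.
  static long capMergeCardinality(long mergeEstimate, long exchangeCardinality) {
    if (mergeEstimate < 0 || exchangeCardinality < 0) return mergeEstimate;
    // A merge aggregation cannot emit more rows than it receives.
    return Math.min(mergeEstimate, exchangeCardinality);
  }
}
```

For example, a merge estimate of 500 rows over an exchange that outputs
200 rows would be capped to 200, while an estimate of 100 rows would be
left as is.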
This patch also skips the tuple-based optimization if a conjunct (HAVING
predicate) was ever assigned to the AggregationNode. Skipping the
optimization causes the following PlannerTest files (under
testdata/workloads/functional-planner/queries/PlannerTest/) to revert
their cardinality estimates to their state prior to IMPALA-13405:
tpcds/tpcds-q39a.test
tpcds/tpcds-q39b.test
tpcds_cpu_cost/tpcds-q39a.test
tpcds_cpu_cost/tpcds-q39b.test
Testing:
- Enable cardinality validation in testMultipleDistinct*
- Update aggregation.test to reflect current PlannerTest output and
  add some test cases.
- Pass core tests.
Change-Id: Ica8227fdc46a1ef59bef5ae5424ba3907827411d
---
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/DistributedPlanner.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/test/java/org/apache/impala/planner/PlannerTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/multiple-distinct-materialization.test
M testdata/workloads/functional-planner/queries/PlannerTest/multiple-distinct-predicates.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q39a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q39b.test
11 files changed, 1,362 insertions(+), 474 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/46/22046/5
--
To view, visit http://gerrit.cloudera.org:8080/22046
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ica8227fdc46a1ef59bef5ae5424ba3907827411d
Gerrit-Change-Number: 22046
Gerrit-PatchSet: 5
Gerrit-Owner: Riza Suminto <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>