Riza Suminto has uploaded a new patch set (#59) to the change originally
created by Qifan Chen. ( http://gerrit.cloudera.org:8080/19033 )
Change subject: IMPALA-11604 (part 1): Model ProcessingCost for PlanNodes &
DataSink
......................................................................
IMPALA-11604 (part 1): Model ProcessingCost for PlanNodes & DataSink
This patch augments IMPALA-10992 by establishing a model to allow the
weighted total amount of data to process to be used as a new factor in
the definition and selection of an executor group. We call this model
ProcessingCost.
ProcessingCost of a PlanNode/DataSink is a weighted amount of data
processed by that node/sink. The basic ProcessingCost is computed with a
general formula as follows.
ProcessingCost is a pair: PC(D, N), where D = I * (C + M)
where D is the weighted amount of data processed
I is the input cardinality
C is the expression evaluation cost per row.
Set to total weight of expression evaluation in node/sink.
M is a materialization cost per row.
Only used by scan and exchange node. Otherwise, 0.
N is the number of instances.
Default to D / 10000000.
In this patch, the weight of each expression evaluation is set to a
constant of 1. A description of the computation for each kind of
PlanNode/DataSink is given below.
01. AggregationNode:
Each AggregateInfo has its C as a sum of grouping expression and
aggregate expression and then assigned a single ProcessingCost
individually. These ProcessingCosts then summed to be the Aggregation
node's ProcessingCost;
02. AnalyticEvalNode:
C is the sum of the evaluation costs for analytic functions;
03. CardinalityCheckNode:
Use the general formula, I = 1;
04. DataSourceScanNode:
Follow the formula from the superclass ScanNode;
05. EmptySetNode:
I = 0;
06. ExchangeNode:
M = (average serialized row size) / 1024
A modification of the general formula when in broadcast mode:
D = D * number of receivers;
07. HashJoinNode:
probe cost = PC(I0 * C(equiJoin predicate), N) +
PC(output cardinality * C(otherJoin predicate), N)
build cost = PC(I1 * C(equi-join predicate), N)
With I0 and I1 as input cardinality of the probe and build side
accordingly. If the plan node does not have a separate build, ProcessingCost
is the sum of probe cost and build cost. Otherwise, ProcessingCost is
equal to probeCost.
08. HbaseScanNode, HdfsScanNode, and KuduScanNode:
Follow the formula from the superclass ScanNode;
09. Nested loop join node:
When the right child is not a SingularRowSrcNode:
probe cost = PC(I0 * C(equiJoin predicate), N) +
PC(output cardinality * C(otherJoin predicate), N)
build cost = PC(I1 * C(equiJoin predicate), N)
When the right child is a SingularRowSrcNode:
probe cost = PC(I0, N)
build cost = PC(I0 * I1, N)
With I0 and I1 as input cardinality of the probe and build side
accordingly. If the plan node does not have a separate build, ProcessingCost
is the sum of probe cost and build cost. Otherwise, ProcessingCost is
equal to probeCost.
10. ScanNode:
M = (average row size) / 1024;
11. SelectNode:
Use the general formula;
12. SingularRowSrcNode:
Since the node is involved once per input in nested loop join, the
contribution of this node is computed in nested loop join;
13. SortNode:
C is the evaluation cost for the sort expression;
14. SubplanNode:
C is 1. I is the multiplication of the cardinality of the left and
the right child;
15. Union node:
C is the cost of result expression evaluation from all non-pass-through
children;
16. Unnest node:
I is the cardinality of the containing SubplanNode and C is 1.
17. DataStreamSink:
M = 1 / num rows per batch.
18. JoinBuildSink:
ProcessingCost is the build cost of its associated JoinNode.
19. PlanRootSink:
If result spooling is enabled, C is the cost of output expression
evaluation. Otherwise. ProcessingCost is zero.
20. TableSink:
C is the cost of output expression evaluation.
TableSink subclasses (including HBaseTableSink, HdfsTableSink, and
KuduTableSink) follows the same formula;
Part 2 of IMPALA-11604 will implement an algorithm that tries to adjust
the number of instances for each fragment by considering their
production-consumption ratio, and then finally returns a number
representing an ideal CPU core count required for a query to run
efficiently.
Testing:
- Pass FE tests.
Co-authored-by: Riza Suminto <[email protected]>
Change-Id: If32dc770dfffcdd0be2b5555a789a7720952c68a
---
M fe/src/main/java/org/apache/impala/analysis/AggregateInfo.java
M fe/src/main/java/org/apache/impala/analysis/SortInfo.java
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/AnalyticEvalNode.java
A fe/src/main/java/org/apache/impala/planner/BaseProcessingCost.java
A fe/src/main/java/org/apache/impala/planner/BroadcastProcessingCost.java
M fe/src/main/java/org/apache/impala/planner/CardinalityCheckNode.java
M fe/src/main/java/org/apache/impala/planner/DataSink.java
M fe/src/main/java/org/apache/impala/planner/DataSourceScanNode.java
M fe/src/main/java/org/apache/impala/planner/DataStreamSink.java
M fe/src/main/java/org/apache/impala/planner/EmptySetNode.java
M fe/src/main/java/org/apache/impala/planner/ExchangeNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseScanNode.java
M fe/src/main/java/org/apache/impala/planner/HBaseTableSink.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsScanNode.java
M fe/src/main/java/org/apache/impala/planner/HdfsTableSink.java
M fe/src/main/java/org/apache/impala/planner/JoinBuildSink.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/KuduTableSink.java
M fe/src/main/java/org/apache/impala/planner/NestedLoopJoinNode.java
M fe/src/main/java/org/apache/impala/planner/PlanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanRootSink.java
A fe/src/main/java/org/apache/impala/planner/ProcessingCost.java
A fe/src/main/java/org/apache/impala/planner/ScaledProcessingCost.java
M fe/src/main/java/org/apache/impala/planner/ScanNode.java
M fe/src/main/java/org/apache/impala/planner/SelectNode.java
M fe/src/main/java/org/apache/impala/planner/SingularRowSrcNode.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/main/java/org/apache/impala/planner/SubplanNode.java
A fe/src/main/java/org/apache/impala/planner/SumProcessingCost.java
M fe/src/main/java/org/apache/impala/planner/TableSink.java
M fe/src/main/java/org/apache/impala/planner/UnionNode.java
M fe/src/main/java/org/apache/impala/planner/UnnestNode.java
M fe/src/main/java/org/apache/impala/util/ExprUtil.java
36 files changed, 1,033 insertions(+), 24 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/33/19033/59
--
To view, visit http://gerrit.cloudera.org:8080/19033
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: If32dc770dfffcdd0be2b5555a789a7720952c68a
Gerrit-Change-Number: 19033
Gerrit-PatchSet: 59
Gerrit-Owner: Qifan Chen <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Kurt Deschler <[email protected]>
Gerrit-Reviewer: Qifan Chen <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Reviewer: Wenzhe Zhou <[email protected]>