Hello Aman Sinha, Tim Armstrong, Bikramjeet Vig, Impala Public Jenkins,
I'd like you to reexamine a change. Please visit
http://gerrit.cloudera.org:8080/16842
to look at the new patch set (#11).
Change subject: IMPALA-10377: Improve the accuracy of resource estimation
PlanNode does not consider some factors when estimating memory, which
causes a large estimation error
......................................................................
IMPALA-10377: Improve the accuracy of resource estimation
PlanNode does not consider some factors when estimating memory,
which causes a large estimation error.
AggregationNode
1. MemoryEstimate = Ndv * (AvgRowSize + SizeOfBucket)
2. When estimating the Ndv of a merge aggregation, Ndv should be
   divided only once.
3. If there are no grouping exprs, MemoryEstimate =
   MIN_PLAIN_AGG_MEM
SortNode
1. MemoryEstimate = Cardinality * AvgRowSize, which is the memory
   the sort uses when enough memory is available
HashJoinNode
1. MemoryEstimate = Datas + Buckets + DuplicateNodes, where
   Datas = RightTableCardinality * AvgRowSize,
   Buckets = roundUpToPowerOf2(RightTableCardinality) *
       SizeOfBucket,
   DuplicateNodes = (RightTableCardinality - RightNdv) *
       SizeOfDuplicateNode
KuduScanNode
1. MemoryEstimate = Columns * BytesPerColumn * MaxScannerThreads,
   where Columns is the number of columns the query scans, not all
   the columns of the table
UnitTest
1. CardinalityTest adds test cases for memory estimation, and
   existing test cases related to memory estimation are updated
Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1
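The formulas in the commit message above can be sketched in Java as
follows. This is only an illustration of the arithmetic, not the code
from this patch set: the class name, method names, and parameter lists
are hypothetical, the sizes (SizeOfBucket, SizeOfDuplicateNode,
BytesPerColumn, MIN_PLAIN_AGG_MEM) are passed in as arguments because
their real values are not stated here, AvgRowSize is assumed to be a
whole number of bytes, and overflow is ignored.

```java
/** Illustrative sketch of the estimation formulas; all names are hypothetical. */
final class MemEstimateSketch {
  private MemEstimateSketch() {}

  /** Smallest power of two that is >= n (returns 1 for n <= 1). */
  static long roundUpToPowerOf2(long n) {
    if (n <= 1) return 1;
    long highest = Long.highestOneBit(n);
    return highest == n ? n : highest << 1;
  }

  /**
   * AggregationNode: Ndv * (AvgRowSize + SizeOfBucket), or the fixed
   * minimum (MIN_PLAIN_AGG_MEM) when there are no grouping exprs.
   */
  static long aggregation(long ndv, long avgRowSize, long sizeOfBucket,
      boolean hasGroupingExprs, long minPlainAggMem) {
    if (!hasGroupingExprs) return minPlainAggMem;
    return ndv * (avgRowSize + sizeOfBucket);
  }

  /** SortNode: Cardinality * AvgRowSize. */
  static long sort(long cardinality, long avgRowSize) {
    return cardinality * avgRowSize;
  }

  /** HashJoinNode: Datas + Buckets + DuplicateNodes for the right (build) side. */
  static long hashJoin(long rightCardinality, long rightNdv, long avgRowSize,
      long sizeOfBucket, long sizeOfDuplicateNode) {
    long datas = rightCardinality * avgRowSize;
    long buckets = roundUpToPowerOf2(rightCardinality) * sizeOfBucket;
    // Assumption: clamp at zero in case the Ndv estimate exceeds the cardinality.
    long duplicates = Math.max(0L, rightCardinality - rightNdv) * sizeOfDuplicateNode;
    return datas + buckets + duplicates;
  }

  /** KuduScanNode: Columns * BytesPerColumn * MaxScannerThreads, scanned columns only. */
  static long kuduScan(int scannedColumns, long bytesPerColumn, int maxScannerThreads) {
    return (long) scannedColumns * bytesPerColumn * maxScannerThreads;
  }
}
```

With arbitrary inputs such as a right side of 1000 rows, 800 distinct
keys, 16-byte rows, 8-byte buckets and 16-byte duplicate nodes, the
hash join sketch yields 16000 + 8192 + 3200 = 27392 bytes.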
---
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlannerContext.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test
M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-enabled.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters-hdfs-num-rows-est-enabled.test
M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test
M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M testdata/workloads/functional-planner/queries/PlannerTest/preagg-bytes-limit.test
M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/result-spooling.test
M testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-query-options.test
M testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-query/queries/QueryTest/admission-max-min-mem-limits.test
M testdata/workloads/functional-query/queries/QueryTest/admission-reject-mem-estimate.test
M testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/dedicated-coord-mem-estimates.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M tests/query_test/test_mem_usage_scaling.py
33 files changed, 620 insertions(+), 404 deletions(-)
git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/16842/11
--
To view, visit http://gerrit.cloudera.org:8080/16842
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1
Gerrit-Change-Number: 16842
Gerrit-PatchSet: 11
Gerrit-Owner: liuyao
Gerrit-Reviewer: Aman Sinha
Gerrit-Reviewer: Bikramjeet Vig
Gerrit-Reviewer: Impala Public Jenkins
Gerrit-Reviewer: Tim Armstrong
Gerrit-Reviewer: liuyao