liuyao has uploaded a new patch set (#3). ( http://gerrit.cloudera.org:8080/16842 )
Change subject: IMPALA-10377 Improve the accuracy of resource estimation PlanNode does not consider some factors when estimating memory, this will cause a large error rate
......................................................................

IMPALA-10377 Improve the accuracy of resource estimation

PlanNode does not consider some factors when estimating memory, which
causes a large error rate.

AggregationNode
1. The memory occupied by the hash table's own data structure is not
   considered. Each new value inserted into the hash table adds a bucket,
   and a bucket is 16 bytes.
2. When estimating the NDV of a merge aggregation with multiple grouping
   exprs, the NDV may be divided by the number of fragment instances
   several times. It should be divided only once.
3. When estimating the NDV of a merge aggregation with multiple grouping
   exprs, the estimated memory is much smaller than the actual usage.
4. If there are no grouping exprs, the estimated memory is much larger
   than the actual usage.
5. If the NDV of the grouping exprs is very small, the estimated memory
   is much larger than the actual usage.

SortNode
1. Estimate the memory usage of external sort. The estimated memory is
   much smaller than the actual usage.

HashJoinNode
1. The memory occupied by the hash table's own data structure is not
   considered. The hash table keeps duplicate data, so the size of
   DuplicateNode should be considered.
2. The hash table creates multiple buckets in advance. The size of these
   buckets should be considered.

KuduScanNode
1. Memory is estimated as if all columns were scanned, so the estimated
   memory is much larger than the actual usage.

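For reviewers, the arithmetic behind AggregationNode items 1 and 2 can be sketched as below. This is only an illustration under stated assumptions, not the code in this patch set: the class and method names, the cap at the input cardinality, and the rounding are hypothetical. Only the 16-byte bucket size and the "divide once" rule come from the commit message.

```java
public final class MemEstimateSketch {
  // From the commit message: one hash table bucket is 16 bytes.
  static final long BUCKET_BYTES = 16;

  private MemEstimateSketch() {}

  // Hypothetical helper: row data plus one bucket per distinct value.
  static long aggHashTableBytes(long ndv, long avgRowBytes) {
    long values = Math.max(1, ndv);
    return values * avgRowBytes + values * BUCKET_BYTES;
  }

  // Hypothetical helper: combine the NDVs of all grouping exprs first
  // (capped at the input cardinality, an assumption of this sketch),
  // then divide by the number of fragment instances exactly once.
  static long perInstanceNdv(long[] groupingNdvs, long inputCardinality,
      int numInstances) {
    long cap = Math.max(1, inputCardinality);
    long ndv = 1;
    for (long n : groupingNdvs) {
      long factor = Math.max(1, n);
      // Saturate at the cap instead of overflowing.
      ndv = (ndv > cap / factor) ? cap : ndv * factor;
    }
    // Ceiling division, applied once to the combined NDV.
    return Math.max(1, (ndv + numInstances - 1) / numInstances);
  }

  public static void main(String[] args) {
    long ndv = perInstanceNdv(new long[] {100, 50}, 1_000_000L, 10);
    System.out.println("per-instance NDV: " + ndv);
    System.out.println("hash table bytes: " + aggHashTableBytes(ndv, 24));
  }
}
```

Dividing each grouping expr's NDV by the instance count before multiplying would shrink the product by that factor once per expr, which is the under-estimate item 2 describes.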
Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1
---
M fe/src/main/java/org/apache/impala/planner/AggregationNode.java
M fe/src/main/java/org/apache/impala/planner/HashJoinNode.java
M fe/src/main/java/org/apache/impala/planner/JoinNode.java
M fe/src/main/java/org/apache/impala/planner/KuduScanNode.java
M fe/src/main/java/org/apache/impala/planner/PlanFragment.java
M fe/src/main/java/org/apache/impala/planner/PlannerContext.java
M fe/src/main/java/org/apache/impala/planner/SortNode.java
M fe/src/test/java/org/apache/impala/planner/CardinalityTest.java
M testdata/workloads/functional-planner/queries/PlannerTest/acid-scans.test
M testdata/workloads/functional-planner/queries/PlannerTest/aggregation.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns-mt-dop.test
M testdata/workloads/functional-planner/queries/PlannerTest/analytic-fns.test
M testdata/workloads/functional-planner/queries/PlannerTest/bloom-filter-assignment.test
M testdata/workloads/functional-planner/queries/PlannerTest/card-inner-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/card-multi-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/card-outer-join.test
M testdata/workloads/functional-planner/queries/PlannerTest/complex-types-file-formats.test
M testdata/workloads/functional-planner/queries/PlannerTest/conjunct-ordering.test
M testdata/workloads/functional-planner/queries/PlannerTest/constant-folding.test
M testdata/workloads/functional-planner/queries/PlannerTest/constant-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/convert-to-cnf.test
M testdata/workloads/functional-planner/queries/PlannerTest/ddl.test
M testdata/workloads/functional-planner/queries/PlannerTest/default-join-distr-mode-broadcast.test
M testdata/workloads/functional-planner/queries/PlannerTest/default-join-distr-mode-shuffle.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-codegen.test
M testdata/workloads/functional-planner/queries/PlannerTest/disable-preaggregations.test
M testdata/workloads/functional-planner/queries/PlannerTest/distinct-estimate.test
M testdata/workloads/functional-planner/queries/PlannerTest/distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/empty.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection-hdfs-num-rows-est-enabled.test
M testdata/workloads/functional-planner/queries/PlannerTest/fk-pk-join-detection.test
M testdata/workloads/functional-planner/queries/PlannerTest/hbase.test
M testdata/workloads/functional-planner/queries/PlannerTest/hdfs.test
M testdata/workloads/functional-planner/queries/PlannerTest/implicit-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view-limit.test
M testdata/workloads/functional-planner/queries/PlannerTest/inline-view.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert-hdfs-writer-limit.test
M testdata/workloads/functional-planner/queries/PlannerTest/insert-sort-by-zorder.test
M testdata/workloads/functional-planner/queries/PlannerTest/joins-hdfs-num-rows-est-enabled.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-delete.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-update.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu-upsert.test
M testdata/workloads/functional-planner/queries/PlannerTest/kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/limit-pushdown-analytic.test
M testdata/workloads/functional-planner/queries/PlannerTest/max-row-size.test
M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters-hdfs-num-rows-est-enabled.test
M testdata/workloads/functional-planner/queries/PlannerTest/min-max-runtime-filters.test
M testdata/workloads/functional-planner/queries/PlannerTest/mt-dop-validation.test
M testdata/workloads/functional-planner/queries/PlannerTest/multiple-distinct-materialization.test
M testdata/workloads/functional-planner/queries/PlannerTest/multiple-distinct-predicates.test
M testdata/workloads/functional-planner/queries/PlannerTest/nested-collections.test
M testdata/workloads/functional-planner/queries/PlannerTest/optimize-simple-limit.test
A testdata/workloads/functional-planner/queries/PlannerTest/optimize-simple-limit.test.BACKUP.16699.test
A testdata/workloads/functional-planner/queries/PlannerTest/optimize-simple-limit.test.BASE.16699.test
A testdata/workloads/functional-planner/queries/PlannerTest/optimize-simple-limit.test.LOCAL.16699.test
A testdata/workloads/functional-planner/queries/PlannerTest/optimize-simple-limit.test.REMOTE.16699.test
M testdata/workloads/functional-planner/queries/PlannerTest/order.test
M testdata/workloads/functional-planner/queries/PlannerTest/outer-to-inner-joins.test
M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering-disabled.test
M testdata/workloads/functional-planner/queries/PlannerTest/parquet-filtering.test
M testdata/workloads/functional-planner/queries/PlannerTest/parquet-stats-agg.test
M testdata/workloads/functional-planner/queries/PlannerTest/partition-pruning.test
M testdata/workloads/functional-planner/queries/PlannerTest/preagg-bytes-limit.test
M testdata/workloads/functional-planner/queries/PlannerTest/resource-requirements.test
M testdata/workloads/functional-planner/queries/PlannerTest/result-spooling.test
M testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-propagation.test
M testdata/workloads/functional-planner/queries/PlannerTest/runtime-filter-query-options.test
M testdata/workloads/functional-planner/queries/PlannerTest/scan-node-fs-scheme.test
M testdata/workloads/functional-planner/queries/PlannerTest/semi-join-distinct.test
M testdata/workloads/functional-planner/queries/PlannerTest/sort-expr-materialization.test
M testdata/workloads/functional-planner/queries/PlannerTest/spillable-buffer-sizing.test
M testdata/workloads/functional-planner/queries/PlannerTest/tablesample.test
M testdata/workloads/functional-planner/queries/PlannerTest/topn-bytes-limit-small.test
M testdata/workloads/functional-planner/queries/PlannerTest/topn-bytes-limit.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q01.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q02.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q04.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q05.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q06.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q07.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q08.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q09.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q10a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q11.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q12.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q13.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q14b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q15.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q16.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q17.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q18.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q19.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q20.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q21.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q22.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q23b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q24b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q25.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q26.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q27.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q28.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q29.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q31.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q32.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q34.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q35a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q36.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q37.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q38.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39a.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q39b.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q40.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q44.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q45.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q46.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q47.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q48.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q49.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q50.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q51.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q54.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q56.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q57.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q58.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q59.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q60.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q61.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q64.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q65.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q66.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q67.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q68.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q69.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q71.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q72.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q73.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q74.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q75.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q76.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q77.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q78.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q79.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q80.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q81.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q82.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q83.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q85.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q86.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q87.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q88.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q90.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q91.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q92.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q93.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q94.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q95.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q96.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q97.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpcds/tpcds-q98.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-all.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-kudu.test
M testdata/workloads/functional-planner/queries/PlannerTest/tpch-nested.test
M testdata/workloads/functional-planner/queries/PlannerTest/with-clause.test
M testdata/workloads/functional-query/queries/QueryTest/admission-max-min-mem-limits.test
M testdata/workloads/functional-query/queries/QueryTest/admission-reject-mem-estimate.test
M testdata/workloads/functional-query/queries/QueryTest/admission-reject-min-reservation.test
M testdata/workloads/functional-query/queries/QueryTest/dedicated-coord-mem-estimates.test
M testdata/workloads/functional-query/queries/QueryTest/explain-level2.test
M tests/query_test/test_mem_usage_scaling.py
172 files changed, 6,478 insertions(+), 5,017 deletions(-)

  git pull ssh://gerrit.cloudera.org:29418/Impala-ASF refs/changes/42/16842/3
--
To view, visit http://gerrit.cloudera.org:8080/16842
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: newpatchset
Gerrit-Change-Id: Ic01db168ff2c6d6de33ee553a8175599f035d7a1
Gerrit-Change-Number: 16842
Gerrit-PatchSet: 3
Gerrit-Owner: liuyao <liu...@sensorsdata.cn>
Gerrit-Reviewer: Impala Public Jenkins <impala-public-jenk...@cloudera.com>