[ https://issues.apache.org/jira/browse/IMPALA-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Riza Suminto resolved IMPALA-13333.
-----------------------------------
Fix Version/s: Impala 5.0.0
Target Version: Impala 5.0.0
Resolution: Fixed
> Curb memory estimation for SORT node
> ------------------------------------
>
> Key: IMPALA-13333
> URL: https://issues.apache.org/jira/browse/IMPALA-13333
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Reporter: Riza Suminto
> Assignee: Riza Suminto
> Priority: Major
> Fix For: Impala 5.0.0
>
>
> Overestimating cardinality can lead to severe memory overestimation for a
> SORT node, even in a parallel plan. The TPC-DS Q31 and Q51 plans against a
> synthetic 3TB-scale workload show such large overestimation:
> [https://github.com/apache/impala/blob/ae6a3b9ec058dfea4b4f93d4828761f792f0b55e/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q31.test#L1319-L1323]
> [https://github.com/apache/impala/blob/ae6a3b9ec058dfea4b4f93d4828761f792f0b55e/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q51.test#L511-L515]
> The planner should avoid estimating terabytes or petabytes of memory for a
> SORT node, since a SORT node can spill to disk under memory pressure. The
> planner can also take the SORT_RUN_BYTES_LIMIT or MAX_SORT_RUN_SIZE option
> value into account to come up with a lower memory estimate.
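
The idea in the description can be illustrated with a rough sketch. This is not the Impala planner code or the actual fix for this issue: the function name, its parameters, and the 8 GiB default cap are all hypothetical, and the run limit is assumed to be already expressed in bytes (the real query options may use different units or semantics). It only shows how a cardinality-driven estimate might be clamped because the sort can spill.

```python
MIB = 1024 * 1024
GIB = 1024 * MIB


def estimate_sort_mem_bytes(cardinality, avg_row_bytes, min_reservation_bytes,
                            run_limit_bytes=0, spill_cap_bytes=8 * GIB):
    """Return a clamped memory estimate for a sort (hypothetical sketch).

    cardinality           -- estimated number of input rows (may be inflated)
    avg_row_bytes         -- estimated average materialized row size
    min_reservation_bytes -- floor the sort needs in order to run at all
    run_limit_bytes       -- assumed per-run limit in bytes; 0 means unset
    spill_cap_bytes       -- assumed ceiling, justified by spill-to-disk
    """
    # Naive estimate: hold the whole input in memory.
    naive = cardinality * avg_row_bytes

    # The sort can spill, so do not claim more than the ceiling.
    capped = min(naive, spill_cap_bytes)

    # If a run-size limit is configured, in-memory runs cannot exceed it.
    if run_limit_bytes > 0:
        capped = min(capped, run_limit_bytes)

    # Never go below the minimum reservation.
    return max(capped, min_reservation_bytes)
```

With an inflated cardinality of 10**12 rows at 100 bytes each, the naive estimate is about 100 TB, while the clamped estimate stays at the assumed ceiling, or at the run limit when one is set.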