[ 
https://issues.apache.org/jira/browse/IMPALA-13333?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto resolved IMPALA-13333.
-----------------------------------
     Fix Version/s: Impala 5.0.0
    Target Version: Impala 5.0.0
        Resolution: Fixed

> Curb memory estimation for SORT node
> ------------------------------------
>
>                 Key: IMPALA-13333
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13333
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>            Reporter: Riza Suminto
>            Assignee: Riza Suminto
>            Priority: Major
>             Fix For: Impala 5.0.0
>
>
> Overestimating cardinality can lead to severe memory overestimation for the 
> SORT node, even in a parallel plan. The TPC-DS Q31 and Q51 plans against a 
> synthetic 3TB scale workload show such overestimation:
> [https://github.com/apache/impala/blob/ae6a3b9ec058dfea4b4f93d4828761f792f0b55e/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q31.test#L1319-L1323]
> [https://github.com/apache/impala/blob/ae6a3b9ec058dfea4b4f93d4828761f792f0b55e/testdata/workloads/functional-planner/queries/PlannerTest/tpcds_cpu_cost/tpcds-q51.test#L511-L515]
> The planner should not estimate terabytes or petabytes of memory for a SORT 
> node, given that the SORT node can spill to disk under memory pressure. The 
> planner can also take the SORT_RUN_BYTES_LIMIT or MAX_SORT_RUN_SIZE option 
> value into account to come up with a lower memory estimate.
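
The capping idea described above can be sketched roughly as follows. This is
an illustration only, not Impala's planner code: the function name, its
parameters, and the 8 GiB default ceiling are all hypothetical, and mapping
`sort_run_limit_bytes` onto SORT_RUN_BYTES_LIMIT is an assumption.

```python
GIB = 1024 ** 3


def estimate_sort_memory(cardinality: int,
                         avg_row_bytes: int,
                         sort_run_limit_bytes: int = 0,
                         spill_cap_bytes: int = 8 * GIB) -> int:
    """Return a capped memory estimate, in bytes, for a sort.

    All names and defaults here are hypothetical; they only illustrate
    bounding the estimate because the sort can spill to disk.
    """
    if cardinality < 0 or avg_row_bytes < 0:
        raise ValueError("cardinality and avg_row_bytes must be non-negative")

    # Naive estimate: hold every input row in memory at once.
    full_in_memory = cardinality * avg_row_bytes

    # The sort can spill, so the estimate never needs to exceed a fixed cap.
    cap = spill_cap_bytes

    # A positive run-size limit (assumed to be expressed in bytes) tightens
    # the cap further; zero or negative means "no limit set".
    if sort_run_limit_bytes > 0:
        cap = min(cap, sort_run_limit_bytes)

    return min(full_in_memory, cap)
```

With these assumed defaults, an overestimated input of 10**12 rows at 100
bytes per row yields the 8 GiB cap instead of roughly 100 TB, while a small
input keeps its uncapped estimate.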



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
