[ 
https://issues.apache.org/jira/browse/IMPALA-13644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Riza Suminto resolved IMPALA-13644.
-----------------------------------
    Fix Version/s: Impala 4.5.0
       Resolution: Fixed

> Generalize and move getPerInstanceNdvForCpuCosting into AggregationNode.
> ------------------------------------------------------------------------
>
>                 Key: IMPALA-13644
>                 URL: https://issues.apache.org/jira/browse/IMPALA-13644
>             Project: IMPALA
>          Issue Type: Bug
>          Components: Frontend
>    Affects Versions: Impala 4.4.0
>            Reporter: Riza Suminto
>            Assignee: Riza Suminto
>            Priority: Major
>             Fix For: Impala 4.5.0
>
>
> getPerInstanceNdvForCpuCosting estimates the number of distinct values of 
> exprs per fragment instance, accounting for the likelihood of duplicate keys 
> across fragment instances. It borrows the probabilistic model from the 
> formula described in IMPALA-2945. The method is used only by 
> AggregationNode.
> [https://github.com/apache/impala/blob/99529db6ad62ddc34cbfd924d7e41b1fce5b60a2/fe/src/main/java/org/apache/impala/planner/PlanFragment.java#L634-L642]
>  
> We should move this method to AggregationNode and generalize it to accept, 
> as input, the NDV estimate calculated in AggregationNode.computeStats(). The 
> estimate from computeStats() should be more precise now, after the 
> improvements from IMPALA-13405, IMPALA-13526, and IMPALA-13622.
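
For illustration only, here is a minimal sketch of what a generalized per-instance NDV estimate could look like once the global NDV is passed in as an argument. The formula from IMPALA-2945 is not reproduced in this message, so the sketch assumes the standard uniform-draw model: if an instance receives n rows drawn uniformly from D distinct values, it is expected to see D * (1 - (1 - 1/D)^n) of them. The class name, method name, and parameters are hypothetical and are not taken from the Impala code base.

```java
// Hypothetical sketch, not Impala's implementation.
// Assumption: rows are spread evenly across fragment instances and each row's
// key is drawn uniformly at random from the globalNdv distinct values.
final class PerInstanceNdvSketch {
  private PerInstanceNdvSketch() {}

  /**
   * Estimates how many distinct keys a single fragment instance will see.
   *
   * @param globalNdv        NDV estimate for the grouping exprs (for example,
   *                         the value computed by a node's stats computation)
   * @param inputCardinality total number of input rows across all instances
   * @param numInstances     number of fragment instances
   */
  static long estimatePerInstanceNdv(
      long globalNdv, long inputCardinality, int numInstances) {
    if (globalNdv <= 0 || inputCardinality <= 0 || numInstances <= 0) {
      return 0;
    }
    double rowsPerInstance = (double) inputCardinality / numInstances;

    // Probability that one particular key never shows up among the rows of
    // this instance: (1 - 1/D)^n, computed via log1p for numerical stability.
    double probUnseen =
        Math.exp(rowsPerInstance * Math.log1p(-1.0 / globalNdv));

    // Expected number of distinct keys seen by this instance.
    double expected = globalNdv * (1.0 - probUnseen);

    // Defensive cap: never more than the global NDV or the rows received.
    double capped =
        Math.min(expected, Math.min((double) globalNdv, rowsPerInstance));

    // Report at least one distinct value for a non-empty input.
    return Math.max(1L, Math.round(capped));
  }
}
```

Under this model, a low-NDV key (1,000 distinct values over 1,000,000 rows and 10 instances) yields a per-instance estimate equal to the global NDV, because every instance is expected to see every key. A high-NDV key (1,000,000 distinct values over the same input) yields an estimate just below the 100,000 rows each instance receives.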



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
