[
https://issues.apache.org/jira/browse/IMPALA-13644?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Riza Suminto resolved IMPALA-13644.
-----------------------------------
Fix Version/s: Impala 4.5.0
Resolution: Fixed
> Generalize and move getPerInstanceNdvForCpuCosting into AggregationNode.
> ------------------------------------------------------------------------
>
> Key: IMPALA-13644
> URL: https://issues.apache.org/jira/browse/IMPALA-13644
> Project: IMPALA
> Issue Type: Bug
> Components: Frontend
> Affects Versions: Impala 4.4.0
> Reporter: Riza Suminto
> Assignee: Riza Suminto
> Priority: Major
> Fix For: Impala 4.5.0
>
>
> getPerInstanceNdvForCpuCosting estimates the number of distinct values of
> exprs per fragment instance, accounting for the likelihood of duplicate keys
> across fragment instances. It borrows the probabilistic model from the
> formula described in IMPALA-2945. The method is used only by
> AggregationNode.
> [https://github.com/apache/impala/blob/99529db6ad62ddc34cbfd924d7e41b1fce5b60a2/fe/src/main/java/org/apache/impala/planner/PlanFragment.java#L634-L642]
>
> We should move this method to AggregationNode and generalize it to accept the
> NDV estimate calculated in AggregationNode.computeStats() as input. The
> estimate from computeStats() should be more precise now, after the
> improvements from IMPALA-13405, IMPALA-13526, and IMPALA-13622.
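
For readers unfamiliar with this kind of estimate, here is a minimal sketch of a per-instance NDV calculation. It assumes the standard "balls into bins" model, where rows are spread evenly across instances and each row draws its key uniformly from the global set of distinct values. This is an illustration only and is not confirmed to match the formula in IMPALA-2945 or the code in PlanFragment.java; the class name, method name, and parameters are all hypothetical.

```java
// Hypothetical illustration; not Impala's actual implementation.
final class PerInstanceNdvSketch {
  private PerInstanceNdvSketch() {}

  /**
   * Estimates how many distinct keys a single fragment instance sees.
   *
   * Assumptions (not taken from the Impala source):
   *   - input rows are divided evenly among numInstances,
   *   - each row's key is drawn uniformly from globalNdv distinct values.
   */
  public static double estimatePerInstanceNdv(
      double globalNdv, double inputCardinality, int numInstances) {
    if (globalNdv <= 0 || inputCardinality <= 0 || numInstances <= 0) {
      return 0.0;
    }
    double rowsPerInstance = inputCardinality / numInstances;
    // Probability that one particular value never appears among the
    // rows handled by a single instance.
    double probAbsent = Math.pow(1.0 - 1.0 / globalNdv, rowsPerInstance);
    // Expected number of values that appear at least once.
    double expected = globalNdv * (1.0 - probAbsent);
    // An instance cannot see more distinct values than exist globally,
    // nor more than the number of rows it processes.
    return Math.min(expected, Math.min(globalNdv, rowsPerInstance));
  }
}
```

Under these assumptions, an instance that processes 100 rows drawn from 100 distinct values is expected to see roughly 63 of them, because the same key often lands on the same instance more than once. Taking the global NDV as a parameter, rather than recomputing it inside the method, mirrors the generalization proposed in the issue.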