[
https://issues.apache.org/jira/browse/IMPALA-10445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Work on IMPALA-10445 started by Fifteen.
----------------------------------------
> The ability to adjust NDV's precision with query option
> -------------------------------------------------------
>
> Key: IMPALA-10445
> URL: https://issues.apache.org/jira/browse/IMPALA-10445
> Project: IMPALA
> Issue Type: Improvement
> Components: Frontend
> Affects Versions: Impala 4.0
> Reporter: Fifteen
> Assignee: Fifteen
> Priority: Minor
>
> Since IMPALA-2658, we can trade memory for more accurate NDV estimation. This
> is attractive because tests show an error rate within 0.1% with no
> significant increase in resource usage (#registers is 2 << 18). Users may
> have fewer complaints about computation precision in the future.
> However, the road to applying high-precision NDV in a production environment
> is not smooth:
> 1) We would have to rewrite the SQL of a large number of historical
> workloads, which is time-consuming and error-prone.
> 2) Cluster users, i.e. the SQL writers, are reluctant to lower their
> expectations.
> It would be more convenient to give cluster admins a way to adjust the
> precision for each Admission Control queue according to the cluster's
> resource usage (rough world).
> Proposal:
> Add a new query option DEFAULT_NDV_SCALE to change the default precision
> setting for NDV()
> Implementation:
> # Add a query option in FE
> # If the option is set, use the matching NDV(<expr>, <scale>) function
> instead of NDV().
>
>
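The two implementation steps above can be sketched roughly as follows. This is only an illustration in Python, not the actual FE code (which is Java): the helper names resolve_ndv_args and render_call are hypothetical, and the accepted scale range of 1 to 10 is an assumption about NDV(<expr>, <scale>) that is not stated in this issue. An explicit scale written in the query is left untouched, and an unset option leaves NDV(<expr>) as it is.

```python
from typing import List, Optional

# Assumed valid range for the NDV scale argument (not stated in the issue).
MIN_NDV_SCALE = 1
MAX_NDV_SCALE = 10


def resolve_ndv_args(args: List[str], default_ndv_scale: Optional[int]) -> List[str]:
    """Return the argument list NDV should be called with.

    args holds the argument expressions as written in the query.
    default_ndv_scale is the value of the proposed DEFAULT_NDV_SCALE
    query option, or None when the option is not set.
    """
    if len(args) not in (1, 2):
        raise ValueError("NDV takes one or two arguments")
    # An explicit scale in the query always wins over the query option.
    if len(args) == 2:
        return list(args)
    # Option not set: keep the plain one-argument NDV(<expr>).
    if default_ndv_scale is None:
        return list(args)
    if not MIN_NDV_SCALE <= default_ndv_scale <= MAX_NDV_SCALE:
        raise ValueError("DEFAULT_NDV_SCALE out of range: %d" % default_ndv_scale)
    # Option set: switch to the matching NDV(<expr>, <scale>) form.
    return [args[0], str(default_ndv_scale)]


def render_call(args: List[str]) -> str:
    """Render the resolved call as SQL text, for illustration only."""
    return "NDV(" + ", ".join(args) + ")"


if __name__ == "__main__":
    print(render_call(resolve_ndv_args(["user_id"], 5)))
    print(render_call(resolve_ndv_args(["user_id", "8"], 5)))
    print(render_call(resolve_ndv_args(["user_id"], None)))
```

With this shape, historical queries need no rewrite: only the option value changes, for example per Admission Control queue.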