[jira] [Resolved] (IMPALA-10445) The ability to adjust NDV's precision with query option

Quanlong Huang (Jira) Thu, 22 Apr 2021 02:18:05 -0700


     [ 
https://issues.apache.org/jira/browse/IMPALA-10445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Quanlong Huang resolved IMPALA-10445.
-------------------------------------
    Fix Version/s: Impala 4.0
       Resolution: Fixed

> The ability to adjust NDV's precision with query option
> -------------------------------------------------------
>
>                 Key: IMPALA-10445
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10445
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 4.0
>            Reporter: Fifteen
>            Assignee: Fifteen
>            Priority: Minor
>             Fix For: Impala 4.0
>
>
> Since IMPALA-2658, we can trade memory for more accurate NDV estimation. This 
> is attractive because tests show an error rate within 0.1% with no 
> significant rise in resource usage (#registers is 2 << 18). Users may 
> have fewer complaints about computation precision in the future.
> However, applying high-precision NDV in a production environment is not 
> straightforward:
> 1) We have to rewrite the SQL for a large number of historical workloads, 
> which is time-consuming and error-prone.
> 2) Cluster users, i.e. SQL writers, are reluctant to lower their expectations. 
> It would be more convenient to have a way for cluster admins to adjust 
> precision for each Admission Control queue according to the cluster's 
> resource usage (rough world).
> Proposal:
> Add a new query option, DEFAULT_NDV_SCALE, to change the default precision 
> setting for NDV().
> Implementation:
>  # Add a query option in FE
>  # If the option is set, use the matching NDV(<expr>, <scale>) function 
> instead of NDV(). 
>  
>  
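The two implementation steps above can be pictured with a minimal Python sketch. This is not Impala's frontend code (the FE is written in Java); `FunctionCall`, `apply_default_ndv_scale`, and `to_sql` are hypothetical names used only for illustration, and the 1..10 range for the scale is an assumption that should be checked against the Impala documentation.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumed valid range for the NDV scale argument (not stated in this issue).
MIN_NDV_SCALE = 1
MAX_NDV_SCALE = 10


@dataclass(frozen=True)
class FunctionCall:
    """Hypothetical stand-in for a parsed function call such as NDV(col)."""
    name: str
    args: Tuple[str, ...]


def apply_default_ndv_scale(call: FunctionCall,
                            default_ndv_scale: Optional[int] = None) -> FunctionCall:
    """Rewrite NDV(<expr>) to NDV(<expr>, <scale>) when the option is set.

    Calls that already pass an explicit scale, and calls to other
    functions, are returned unchanged.
    """
    if default_ndv_scale is None:
        return call  # option not set: keep the plain NDV() behaviour
    if call.name.upper() != "NDV" or len(call.args) != 1:
        return call  # not a single-argument NDV call: nothing to rewrite
    if not MIN_NDV_SCALE <= default_ndv_scale <= MAX_NDV_SCALE:
        raise ValueError(f"DEFAULT_NDV_SCALE out of range: {default_ndv_scale}")
    return FunctionCall(call.name, call.args + (str(default_ndv_scale),))


def to_sql(call: FunctionCall) -> str:
    """Render the call back to SQL text."""
    return f"{call.name}({', '.join(call.args)})"


if __name__ == "__main__":
    plain = FunctionCall("NDV", ("user_id",))
    print(to_sql(apply_default_ndv_scale(plain)))      # option unset
    print(to_sql(apply_default_ndv_scale(plain, 10)))  # option set to 10
```

With this shape, existing queries keep calling NDV(<expr>) unchanged, and an admin-controlled option decides which precision is actually used, which is the point of problem 1) above.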



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
