[ https://issues.apache.org/jira/browse/IMPALA-10445?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Quanlong Huang resolved IMPALA-10445.
-------------------------------------
    Fix Version/s: Impala 4.0
       Resolution: Fixed

> The ability to adjust NDV's precision with query option
> -------------------------------------------------------
>
>                 Key: IMPALA-10445
>                 URL: https://issues.apache.org/jira/browse/IMPALA-10445
>             Project: IMPALA
>          Issue Type: Improvement
>          Components: Frontend
>    Affects Versions: Impala 4.0
>            Reporter: Fifteen
>            Assignee: Fifteen
>            Priority: Minor
>             Fix For: Impala 4.0
>
>
> Since IMPALA-2658, we can trade memory for more accurate NDV estimation.
> This is attractive because tests show an error rate within 0.1% with no
> large rise in resource usage (#registers is 2 << 18). Users may have fewer
> complaints about computation precision in the future.
> However, the road to applying high-precision NDV in production environments
> is bumpy:
> 1) We have to rewrite the SQL of a large number of historical workloads,
> which is time-consuming and error-prone.
> 2) Cluster users, aka SQL writers, are reluctant to lower their expectations.
> It would be more convenient to have a way for cluster admins to adjust the
> precision for each Admission Control queue according to the cluster's
> resource usage (roughly speaking).
> Proposal:
> Add a new query option DEFAULT_NDV_SCALE to change the default precision
> setting for NDV().
> Implementation:
> # Add a query option in FE.
> # If the option is set, use the matching NDV(<expr>, <scale>) function
> instead of NDV().
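
The two implementation steps above could be modeled roughly as follows. This is a minimal Python sketch of the rewrite rule only, not Impala's frontend code (which is written in Java). The function name `rewrite_ndv_args`, the 1 to 10 scale bounds, and the rule that an explicit scale takes precedence over the query option are assumptions made for illustration.

```python
from typing import List, Optional

# Assumed bounds for the NDV scale argument (hypothetical; check the Impala docs).
MIN_SCALE, MAX_SCALE = 1, 10


def rewrite_ndv_args(args: List[str], default_ndv_scale: Optional[int]) -> List[str]:
    """Return the argument list to use for an NDV call.

    args              -- the arguments as written in the query
    default_ndv_scale -- value of DEFAULT_NDV_SCALE, or None if the option is unset
    """
    if len(args) == 2:
        # NDV(<expr>, <scale>) was written explicitly: keep the author's scale
        # (assumed precedence rule).
        return list(args)
    if len(args) != 1:
        raise ValueError("NDV takes one or two arguments")
    if default_ndv_scale is None:
        # Option not set: plain NDV(<expr>) is left untouched.
        return list(args)
    if not MIN_SCALE <= default_ndv_scale <= MAX_SCALE:
        raise ValueError(f"scale must be in [{MIN_SCALE}, {MAX_SCALE}]")
    # Option set: switch to the matching NDV(<expr>, <scale>) form.
    return [args[0], str(default_ndv_scale)]
```

Under these assumptions, a session with DEFAULT_NDV_SCALE set to 10 would treat `NDV(user_id)` as `NDV(user_id, 10)`, so historical queries get the higher precision without being rewritten by hand. The column name `user_id` is only a placeholder.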