Fifteen created IMPALA-10445:
--------------------------------
Summary: Control NDV's precision with query option
Key: IMPALA-10445
URL: https://issues.apache.org/jira/browse/IMPALA-10445
Project: IMPALA
Issue Type: Improvement
Components: Frontend
Affects Versions: Impala 4.0
Reporter: Fifteen
Since IMPALA-2658, we can trade memory for more accurate NDV estimation. This
is attractive because tests show an error rate within 0.1% with no significant
rise in resource usage (#registers is 1 << 18). Users should have fewer
complaints about computation precision in the future.
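To make the memory/accuracy trade-off concrete, here is a rough sketch based on the textbook HyperLogLog bound, where the relative standard error is about 1.04 / sqrt(m) for m registers. The precision range 9 to 18 used below is an assumption for illustration, and the numbers are theoretical estimates, not the test results mentioned above.

```python
import math


def hll_standard_error(num_registers: int) -> float:
    """Textbook HyperLogLog relative standard error for m registers."""
    return 1.04 / math.sqrt(num_registers)


def precision_table(precisions=range(9, 19)):
    """Return (precision, registers, standard_error) rows.

    The 9..18 precision range is an assumption made for this sketch.
    """
    rows = []
    for p in precisions:
        m = 1 << p  # number of registers is 2^precision
        rows.append((p, m, hll_standard_error(m)))
    return rows


if __name__ == "__main__":
    for p, m, err in precision_table():
        print(f"precision={p:2d} registers={m:6d} std_error={err:.4%}")
```

With 1 << 18 registers the bound works out to roughly 0.2%, so each doubling of the register count buys only a factor of sqrt(2) in accuracy. That is why a per-queue default matters: the most precise setting is also the most memory-hungry one.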
However, applying high-precision NDV in production environments is not
straightforward:
1) We would have to rewrite the SQL of a large number of historical workloads,
which is time-consuming and error-prone.
2) Cluster users, i.e. SQL writers, are reluctant to lower their expectations.
It would be more convenient if cluster admins could adjust the precision for
each Admission Control queue according to the cluster's resource usage
(rough idea).
Proposal:
Add a new query option, DEFAULT_NDV_PRECISION, to change the default precision
setting for NDV().
Implementation:
1) Add a query option in the FE (frontend).
2) If the option is set, use the matching NDV(expr, P) function instead of
NDV(expr).
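The substitution step could look roughly like the sketch below. It is a language-neutral illustration in Python, not Impala's frontend code: the function name, the representation of arguments as strings, and the rule that an explicit second argument takes priority over the option are all assumptions made for this example.

```python
from typing import List, Optional


def apply_default_ndv_precision(
    args: List[str], default_precision: Optional[int]
) -> List[str]:
    """Return the argument list that NDV should be analyzed with.

    Hypothetical helper: `args` holds the NDV call's arguments as SQL text,
    and `default_precision` is the value of the proposed
    DEFAULT_NDV_PRECISION option (None when the option is not set).
    """
    if default_precision is None:
        # Option not set: keep the original call unchanged.
        return list(args)
    if len(args) != 1:
        # Assumption: an explicit NDV(expr, P) overrides the default.
        return list(args)
    # Single-argument NDV(expr): append the configured precision.
    return [args[0], str(default_precision)]


if __name__ == "__main__":
    print(apply_default_ndv_precision(["col_a"], None))      # option unset
    print(apply_default_ndv_precision(["col_a"], 10))        # default applied
    print(apply_default_ndv_precision(["col_a", "5"], 10))   # explicit wins
```

Doing the substitution during analysis, instead of rewriting query text, would leave historical SQL untouched, which addresses point 1) above. Validation of the option's value range is left out of this sketch.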