[ https://issues.apache.org/jira/browse/CALCITE-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16106858#comment-16106858 ]
Ethan Wang commented on CALCITE-1588:
-------------------------------------
Regarding the syntax {code}COUNT(DISTINCT customerId) APPROXIMATE (WITHIN 10
PERCENT){code}
It seems to me that Druid implements approximate distinct count using
HyperLogLog. In HyperLogLog, I don't think it's common for the user to specify
the accuracy, since the accuracy is implied by the algorithm and depends only on
a constant. So the goal is always "as accurate as possible". Is that true in
Druid? [~gian]
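
To make the point about the constant concrete, here is a rough sketch, not taken from Druid or Calcite. It assumes the commonly cited HyperLogLog relative standard error of about 1.04 / sqrt(m), where m = 2^p is the number of registers, and it assumes that "WITHIN 10 PERCENT" would be read as one standard error. Both function names are hypothetical.

```python
import math


def hll_standard_error(precision_bits: int) -> float:
    """Approximate relative standard error of HyperLogLog with 2**p registers.

    Uses the commonly cited estimate 1.04 / sqrt(m); this is an assumption of
    the sketch, not a statement about any particular implementation.
    """
    registers = 1 << precision_bits
    return 1.04 / math.sqrt(registers)


def min_precision_for_error(target_error: float) -> int:
    """Smallest p whose estimated standard error is at most target_error.

    Hypothetical helper: shows how a per-query accuracy request would have to
    be translated into a sketch size chosen when the sketch is built.
    """
    if target_error <= 0:
        raise ValueError("target_error must be positive")
    precision_bits = 0
    while hll_standard_error(precision_bits) > target_error:
        precision_bits += 1
    return precision_bits


if __name__ == "__main__":
    # "WITHIN 10 PERCENT", read as one standard error (an assumption)
    print(min_precision_for_error(0.10))  # 7, i.e. 128 registers
    print(hll_standard_error(7))
```

Under these assumptions the accuracy is fixed by the register count chosen when the sketch is built, so a per-query bound could only be checked against that constant, not tuned at query time.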
> Add SQL syntax to allow approximate LIMIT and distinct-COUNT
> ------------------------------------------------------------
>
> Key: CALCITE-1588
> URL: https://issues.apache.org/jira/browse/CALCITE-1588
> Project: Calcite
> Issue Type: Bug
> Reporter: Julian Hyde
> Assignee: Julian Hyde
>
> Add SQL syntax to allow approximate LIMIT and distinct-COUNT. These will set
> the properties specified in CALCITE-1587. By default the properties are
> false, so the query will return exact results.
> Exact syntax is to be decided. It could be at the top of the query (therefore
> affecting every LIMIT or aggregate in the query) or it could be more
> localized (e.g. {{COUNT(DISTINCT customerId) APPROXIMATE (WITHIN 10
> PERCENT)}}).