[
https://issues.apache.org/jira/browse/CALCITE-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16107842#comment-16107842
]
Julian Hyde commented on CALCITE-1588:
--------------------------------------
As [~gian] points out, Oracle, BigQuery and MemSQL support
{{APPROX_COUNT_DISTINCT}}. I also see it in VoltDB.
A quick survey of other databases:
* Vertica has {{APPROXIMATE_COUNT_DISTINCT}}
* Redshift has {{[ APPROXIMATE ] COUNT ( [ DISTINCT | ALL ] * | expression )}}.
* In PostgreSQL you can bolt on your own HyperLogLog function, but there
doesn't seem to be a unified approach.
* I don't see anything in DB2 or MySQL.
I think that is a sufficient de facto standard to support
{{APPROX_COUNT_DISTINCT}} in Calcite.
We should also support an APPROXIMATE clause (for an aggregate function and for
SELECT). {{APPROX_COUNT_DISTINCT(x)}} would be syntactic sugar for
{{COUNT(DISTINCT x) APPROXIMATE ()}}.
I propose that we do {{APPROX_COUNT_DISTINCT}} first; don't yet add parser
support for {{APPROXIMATE}}, but do add an {{approximate}} field to
{{AggregateCall}}.
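To make the proposed desugaring concrete, here is a minimal sketch in Python. It is only an illustration, not Calcite code: {{AggCall}}, {{desugar}} and {{to_sql}} are hypothetical names invented for this example, and the real {{AggregateCall}} class may end up looking quite different. The only ideas taken from the proposal above are the boolean {{approximate}} field (false by default, meaning exact results) and the equivalence between {{APPROX_COUNT_DISTINCT(x)}} and {{COUNT(DISTINCT x) APPROXIMATE ()}}.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AggCall:
    """Hypothetical stand-in for an aggregate call (not Calcite's AggregateCall)."""
    function: str
    arg: str
    distinct: bool = False
    approximate: bool = False  # default False: the query returns exact results


def desugar(call: AggCall) -> AggCall:
    """Rewrite APPROX_COUNT_DISTINCT(x) as COUNT(DISTINCT x) with approximate=True.

    Any other call is returned unchanged.
    """
    if call.function.upper() == "APPROX_COUNT_DISTINCT":
        return AggCall("COUNT", call.arg, distinct=True, approximate=True)
    return call


def to_sql(call: AggCall) -> str:
    """Render the call using the (not yet parsed) APPROXIMATE clause syntax."""
    inner = ("DISTINCT " if call.distinct else "") + call.arg
    sql = f"{call.function}({inner})"
    return sql + " APPROXIMATE ()" if call.approximate else sql


if __name__ == "__main__":
    print(to_sql(desugar(AggCall("APPROX_COUNT_DISTINCT", "x"))))
```

With this shape, the function name is only surface syntax; a planner rule would look at the {{approximate}} flag rather than at the name, so adding parser support for {{APPROXIMATE}} later should not require changing the representation.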
> Add SQL syntax to allow approximate LIMIT and distinct-COUNT
> ------------------------------------------------------------
>
> Key: CALCITE-1588
> URL: https://issues.apache.org/jira/browse/CALCITE-1588
> Project: Calcite
> Issue Type: Bug
> Reporter: Julian Hyde
> Assignee: Julian Hyde
>
> Add SQL syntax to allow approximate LIMIT and distinct-COUNT. These will set
> the properties specified in CALCITE-1587. By default the properties are
> false, so the query will return exact results.
> Exact syntax is to be decided. It could be at the top of the query (therefore
> affecting every LIMIT or aggregate in the query) or it could be more
> localized (e.g. {{COUNT(DISTINCT customerId) APPROXIMATE (WITHIN 10
> PERCENT)}}).