[jira] [Commented] (CALCITE-1588) Add SQL syntax to allow approximate LIMIT and distinct-COUNT

Julian Hyde (JIRA) Mon, 31 Jul 2017 12:40:13 -0700

    [ 
https://issues.apache.org/jira/browse/CALCITE-1588?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16107842#comment-16107842
 ]


Julian Hyde commented on CALCITE-1588:
--------------------------------------

As [~gian] points out, Oracle, BigQuery and MemSQL support 
{{APPROX_COUNT_DISTINCT}}. I also see it in VoltDB.

A quick survey of other databases:
* Vertica has {{APPROXIMATE_COUNT_DISTINCT}}
* Redshift has {{[ APPROXIMATE ] COUNT ( [ DISTINCT | ALL ] * | expression )}}.
* In PostgreSQL you can bolt on your own HyperLogLog function, but there 
doesn't seem to be a unified approach.
* I don't see anything in DB2 or MySQL.

I think that is a sufficient de facto standard to support 
{{APPROX_COUNT_DISTINCT}} in Calcite.

We should also support an APPROXIMATE clause (for an aggregate function and for 
SELECT). {{APPROX_COUNT_DISTINCT(x)}} would be syntactic sugar for 
{{COUNT(DISTINCT x) APPROXIMATE ()}}.

I propose that we do {{APPROX_COUNT_DISTINCT}} first; don't yet add parser 
support for {{APPROXIMATE}}, but do add an {{approximate}} field to 
{{AggregateCall}}.
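
To make the proposal concrete, here is a rough sketch of how the sugar could be 
represented once an aggregate call carries an {{approximate}} flag. The class 
{{SketchAggCall}} and its {{desugar}} method are hypothetical names made up for 
illustration; this is not Calcite's actual {{AggregateCall}}, just a minimal 
stand-in with the fields needed to show the idea.

```java
import java.util.Locale;

/** Hypothetical, simplified stand-in for an aggregate call (not Calcite's real class). */
final class SketchAggCall {
  final String function;      // aggregate function name, e.g. "COUNT"
  final boolean distinct;     // true for COUNT(DISTINCT x)
  final boolean approximate;  // the proposed new field
  final String argument;      // single argument, kept as text for simplicity

  SketchAggCall(String function, boolean distinct, boolean approximate,
      String argument) {
    this.function = function;
    this.distinct = distinct;
    this.approximate = approximate;
    this.argument = argument;
  }

  /**
   * Treats APPROX_COUNT_DISTINCT(x) as sugar for COUNT(DISTINCT x) with the
   * approximate flag set. Any other function is passed through as an exact,
   * non-distinct call.
   */
  static SketchAggCall desugar(String functionName, String argument) {
    if (functionName.equalsIgnoreCase("APPROX_COUNT_DISTINCT")) {
      return new SketchAggCall("COUNT", true, true, argument);
    }
    return new SketchAggCall(functionName.toUpperCase(Locale.ROOT), false,
        false, argument);
  }

  /** Renders the call using the APPROXIMATE clause syntax discussed above. */
  @Override public String toString() {
    String call = function + "(" + (distinct ? "DISTINCT " : "") + argument + ")";
    return approximate ? call + " APPROXIMATE ()" : call;
  }
}
```

With this sketch, {{SketchAggCall.desugar("APPROX_COUNT_DISTINCT", "x")}} yields 
a call whose {{approximate}} and {{distinct}} flags are both true. Since the 
flag lives on the call itself, later parser support for {{APPROXIMATE}} could 
set the same field without changing how the call is represented.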

> Add SQL syntax to allow approximate LIMIT and distinct-COUNT
> ------------------------------------------------------------
>
>                 Key: CALCITE-1588
>                 URL: https://issues.apache.org/jira/browse/CALCITE-1588
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Julian Hyde
>            Assignee: Julian Hyde
>
> Add SQL syntax to allow approximate LIMIT and distinct-COUNT. These will set 
> the properties specified in CALCITE-1587. By default the properties are 
> false, so the query will return exact results.
> Exact syntax is to be decided. It could be at the top of the query (therefore 
> affecting every LIMIT or aggregate in the query) or it could be more 
> localized (e.g. {{COUNT(DISTINCT customerId) APPROXIMATE (WITHIN 10 
> PERCENT)}}).



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
