[ 
https://issues.apache.org/jira/browse/CALCITE-4997?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17486860#comment-17486860
 ] 

xiejiajun commented on CALCITE-4997:
------------------------------------

I looked up which dialects have this function, so that it can be enabled for them:
- Hive: https://hivemall.incubator.apache.org/userguide/misc/approx.html
- Spark: https://spark.apache.org/docs/3.1.2/sql-ref-functions-builtin.html#aggregate-functions
- BigQuery: https://cloud.google.com/bigquery/docs/reference/standard-sql/approximate_aggregate_functions
- Oracle: https://docs.oracle.com/database/121/SQLRF/functions013.htm#SQLRF56900
- Snowflake: https://docs.snowflake.com/en/sql-reference/functions/approx_count_distinct.html
- Presto: https://prestodb.io/docs/current/functions/aggregate.html?highlight=approx_distinct#approximate-aggregate-functions

For Presto, the function name is slightly different (approx_distinct).
For Snowflake, there is no entry in SqlDialectFactoryImpl#simple (there are 34 
instances of DatabaseProduct, so I think it may have been omitted earlier).
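
To make the naming difference concrete, here is a rough sketch of a per-dialect lookup. It is not Calcite code: the class ApproxDistinctNames, its methods, and the string keys are all hypothetical, and the fallback to COUNT(DISTINCT ...) for dialects not in the list above is only an assumption about what an unsupported dialect might want.

```java
import java.util.Locale;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper, not part of Calcite: maps a dialect name to the
// approximate distinct-count function listed for it in the links above.
final class ApproxDistinctNames {
    private static final Map<String, String> NAMES = Map.of(
        "HIVE", "APPROX_COUNT_DISTINCT",
        "SPARK", "APPROX_COUNT_DISTINCT",
        "BIGQUERY", "APPROX_COUNT_DISTINCT",
        "ORACLE", "APPROX_COUNT_DISTINCT",
        "SNOWFLAKE", "APPROX_COUNT_DISTINCT",
        "PRESTO", "APPROX_DISTINCT"); // Presto uses a shorter name

    private ApproxDistinctNames() {
    }

    // Returns the function name for the dialect, or empty if it is not listed.
    static Optional<String> functionNameFor(String dialect) {
        return Optional.ofNullable(NAMES.get(dialect.toUpperCase(Locale.ROOT)));
    }

    // Renders the aggregate call; falls back to the exact COUNT(DISTINCT ...)
    // only when the dialect is not listed (an assumption of this sketch).
    static String render(String dialect, String column) {
        return functionNameFor(dialect)
            .map(fn -> fn + "(" + column + ")")
            .orElse("COUNT(DISTINCT " + column + ")");
    }
}
```

With this sketch, ApproxDistinctNames.render("presto", "product_id") yields APPROX_DISTINCT(product_id), while a listed dialect such as Snowflake keeps APPROX_COUNT_DISTINCT(product_id).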

> approx_count_distinct function was incorrectly converted to count(distinct )
> ----------------------------------------------------------------------------
>
>                 Key: CALCITE-4997
>                 URL: https://issues.apache.org/jira/browse/CALCITE-4997
>             Project: Calcite
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 1.29.0
>            Reporter: xiejiajun
>            Priority: Major
>              Labels: pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> {code:java}
> SELECT APPROX_COUNT_DISTINCT(product_id)
> FROM foodmart.product
> {code}
> is converted to
> {code:java}
> SELECT COUNT(DISTINCT product_id)
> FROM foodmart.product
> {code}
> This can cause many tasks to run too slowly.



--
This message was sent by Atlassian Jira
(v8.20.1#820001)
