[
https://issues.apache.org/jira/browse/CALCITE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17697571#comment-17697571
]
Julian Hyde commented on CALCITE-5564:
--------------------------------------
It seems to me that the main difference is that BigQuery's {{PERCENTILE_CONT}}
(and {{PERCENTILE_DISC}}) uses the {{OVER}} clause where the standard (e.g.
Postgres) version uses the {{WITHIN GROUP}} clause. There are a few implications:
* The BigQuery version looks like a windowed aggregate function even when
you're using it as an aggregate function (e.g. {{SELECT x, PERCENTILE_CONT(y,
PERCENTILE 50) OVER (ORDER BY z) FROM t GROUP BY a}})
* We'll need to be careful how we determine whether a query is an aggregate
query. Is {{SELECT x, PERCENTILE_CONT(y, PERCENTILE 50) OVER (ORDER BY z) FROM
t}} an aggregate query? In Postgres no, in BigQuery maybe?
* (In Postgres and Calcite) I suspect that it is valid (and makes sense) to
have both an {{OVER}} and a {{WITHIN GROUP}}. For example, {{select
percentile_cont(0.6) within group (order by sal) over (partition by deptno
order by hiredate rows 2 preceding) from emp}} (there are queries similar to
this in
[redshift.iq|https://github.com/apache/calcite/blob/main/babel/src/test/resources/sql/redshift.iq]).
I think you need to explore the semantics in BigQuery and Postgres. Are there
any queries that are valid in both, and have different semantics?
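
To make the aggregate-versus-window question concrete, here is a rough Python sketch (not Calcite, Postgres, or BigQuery code). It assumes the usual linear-interpolation definition of {{PERCENTILE_CONT}} and a window that only partitions, with no ordering or frame. The {{emp}} rows and the helper names {{as_aggregate}} and {{as_window}} are made up for illustration.

```python
import math
from collections import defaultdict


def percentile_cont(values, fraction):
    """Interpolate linearly between the two nearest ordered values."""
    ordered = sorted(values)
    pos = fraction * (len(ordered) - 1)  # zero-based fractional row position
    lo, hi = math.floor(pos), math.ceil(pos)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


# Hypothetical (deptno, sal) rows standing in for an emp table.
emp = [(10, 1000), (10, 2000), (10, 4000), (20, 1500), (20, 3000)]


def as_aggregate(rows, fraction):
    """WITHIN GROUP style with GROUP BY deptno: one result per group."""
    groups = defaultdict(list)
    for deptno, sal in rows:
        groups[deptno].append(sal)
    return {d: percentile_cont(sals, fraction) for d, sals in groups.items()}


def as_window(rows, fraction):
    """OVER (PARTITION BY deptno) style: every input row is kept."""
    per_group = as_aggregate(rows, fraction)
    return [(deptno, sal, per_group[deptno]) for deptno, sal in rows]


print(as_aggregate(emp, 0.5))  # one entry per deptno
print(as_window(emp, 0.5))     # one entry per input row
```

Under these assumptions the two forms compute the same value per partition and differ only in how many rows come back, which is why deciding whether the {{OVER}} form makes a query an aggregate query matters.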
> Add 2-argument support for PERCENTILE_CONT/DISC
> -----------------------------------------------
>
> Key: CALCITE-5564
> URL: https://issues.apache.org/jira/browse/CALCITE-5564
> Project: Calcite
> Issue Type: Improvement
> Reporter: Tanner Clary
> Assignee: Tanner Clary
> Priority: Major
>
> Calcite currently has implementations for the {{PERCENTILE_CONT}} and
> {{PERCENTILE_DISC}} functions. Their syntax may be found
> [here|https://learn.microsoft.com/en-us/sql/t-sql/functions/percentile-cont-transact-sql?view=sql-server-ver16].
>
> BigQuery offers these functions as well, but the syntax is slightly
> different, and may be found
> [here|https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#percentile_cont].
> The main difference is that instead of using a {{WITHIN GROUP}} clause, the
> array is passed in directly as the first argument to the function.