[jira] [Commented] (CALCITE-5564) Add 2-argument support for PERCENTILE_CONT/DISC

Julian Hyde (Jira) Tue, 07 Mar 2023 10:19:05 -0800


    [ 
https://issues.apache.org/jira/browse/CALCITE-5564?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17697571#comment-17697571
 ]


Julian Hyde commented on CALCITE-5564:
--------------------------------------

It seems to me that the main difference is that BigQuery's {{PERCENTILE_CONT}} 
(and {{PERCENTILE_DISC}}) uses the {{OVER}} clause where the standard (e.g. 
Postgres) version uses the {{WITHIN GROUP}} clause. There are a few implications:
 * The BigQuery version looks like a windowed aggregate function even when 
you're using it as an aggregate function (e.g. {{SELECT x, PERCENTILE_CONT(y, 
PERCENTILE 50) OVER (ORDER BY z) FROM t GROUP BY a}})
 * We'll need to be careful how we determine whether a query is an aggregate 
query. Is {{SELECT x, PERCENTILE_CONT(y, PERCENTILE 50) OVER (ORDER BY z) FROM 
t}} an aggregate query? In Postgres no, in BigQuery maybe?
 * In Postgres and Calcite, I suspect that it is valid (and makes sense) to 
have both an {{OVER}} and a {{WITHIN GROUP}}. For example, {{select 
percentile_cont(0.6) within group (order by sal) over (partition by deptno 
order by hiredate rows 2 preceding) from emp}} (there are queries similar to 
this in 
[redshift.iq|https://github.com/apache/calcite/blob/main/babel/src/test/resources/sql/redshift.iq]).

I think you need to explore the semantics in BigQuery and Postgres. Are there 
any queries that are valid in both, and have different semantics?
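
As a reference point for that comparison, here is a minimal Python sketch of what the standard (Postgres-style) functions compute over a single ordered group. It assumes the usual definitions: {{PERCENTILE_CONT}} interpolates linearly between adjacent values, and {{PERCENTILE_DISC}} returns the first value whose cumulative position reaches the fraction. The function names are hypothetical, inputs are assumed to be non-null, and this is not Calcite or BigQuery code.

```python
import math


def percentile_cont(values, fraction):
    """Sketch of standard PERCENTILE_CONT: linear interpolation (assumed definition)."""
    ordered = sorted(values)
    if not ordered:
        return None
    # Zero-based fractional position of the requested percentile.
    pos = fraction * (len(ordered) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    if lo == hi:
        return float(ordered[lo])
    # Interpolate between the two neighbouring values.
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def percentile_disc(values, fraction):
    """Sketch of standard PERCENTILE_DISC: always returns an actual input value."""
    ordered = sorted(values)
    if not ordered:
        return None
    n = len(ordered)
    # Smallest 1-based index k such that k / n >= fraction.
    k = max(1, math.ceil(fraction * n))
    return ordered[k - 1]


if __name__ == "__main__":
    sal = [10, 20, 30, 40]
    print(percentile_cont(sal, 0.5))
    print(percentile_disc(sal, 0.5))
```

Running the same inputs through each dialect and comparing against a reference like this is one way to spot queries that parse in both systems but return different results.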

> Add 2-argument support for PERCENTILE_CONT/DISC
> -----------------------------------------------
>
>                 Key: CALCITE-5564
>                 URL: https://issues.apache.org/jira/browse/CALCITE-5564
>             Project: Calcite
>          Issue Type: Improvement
>            Reporter: Tanner Clary
>            Assignee: Tanner Clary
>            Priority: Major
>
> Calcite currently has implementations for the {{PERCENTILE_CONT}} and 
> {{PERCENTILE_DISC}} functions. Their syntax may be found 
> [here|https://learn.microsoft.com/en-us/sql/t-sql/functions/percentile-cont-transact-sql?view=sql-server-ver16].
>  
> BigQuery offers these functions as well, but the syntax is slightly 
> different, and may be found 
> [here|https://cloud.google.com/bigquery/docs/reference/standard-sql/functions-and-operators#percentile_cont].
>  The main difference is that instead of using a {{WITHIN GROUP}} clause, the 
> array is passed in directly as the first argument to the function.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
