opened a ticket in https://issues.apache.org/jira/browse/CALCITE-5230. Maybe I'm missing something, but I think that in your example, the percentile_disc will calculate the percentile that you specify and the output will be the appropriate value from the input column.
We can continue this discussion in the Jira ticket. Itiel On Fri, Aug 5, 2022 at 9:00 PM Julian Hyde <[email protected]> wrote: > I’m not sure what Postgres has done, but it doesn’t make sense to derive > the return type of PERCENTILE_DISC from the argument type, even though both > are typically numeric types. (Strictly speaking, the argument to > PERCENTILE_DISC can be any type that has a ‘-‘ operation, so it ought to > work on DATE, TIMESTAMP, INTERVAL as well as numeric.) > > To take an extreme example, if I am a geologist, the argument to > PERCENTILE_DISC might be the age of my rock samples in millions of years, > and the output will be a number between 0 and 1. > > I think your problem is that you don’t like the default data type returned > by PERCENTILE_DISC. The solution would be to make the policy customizable > via RelDataTypeSystem, as we did for other aggregate functions in > https://issues.apache.org/jira/browse/CALCITE-1945 < > https://issues.apache.org/jira/browse/CALCITE-1945>. > > Can you please log a Jira case for this, and we can discuss further there? > > Julian > > > On Aug 5, 2022, at 4:23 AM, Itiel Sadeh <[email protected]> > wrote: > > > > Hi Julian, thanks for your email. > > > > I don't think those tickets will solve the problem as from what I > > understand, they are referring to the input types, while my issue is with > > the return type (although it might have the same root cause). > > > > As you can see in the code > > < > https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql/fun/SqlStdOperatorTable.java#L2304 > >, > > right now the return type is hard coded as DOUBLE. The problem is that in > > the following query: > > > >> select percentile_disc(0.12) within group(order by x) from t; > > > > The PERCENTILE_DISC argument will be only 0.12, and the argument "x" will > > be outside of it (in the WITHIN_GROUP call). so the return type inference > > cannot use the type of the column "x" when it's inferring the return > type. > > > > Itiel > > > > On Thu, Aug 4, 2022 at 9:43 PM Julian Hyde <[email protected]> > wrote: > > > >> I believer that PERCENTILE_DISC was implemented in > >> https://issues.apache.org/jira/browse/CALCITE-4644 < > >> https://issues.apache.org/jira/browse/CALCITE-4644>. There are open > >> issues to support any sortable type [ > >> https://issues.apache.org/jira/browse/CALCITE-4670 < > >> https://issues.apache.org/jira/browse/CALCITE-4670> ] and also to > change > >> the implementation strategy [ > >> https://issues.apache.org/jira/browse/CALCITE-4666 < > >> https://issues.apache.org/jira/browse/CALCITE-4666> ]. > >> > >> Does your case map onto any of those? > >> > >> Is PERCENTILE_DISC able to deduce return types from its arguments today? > >> If so, how does it do it? > >> > >> Julian > >> > >> > >>> On Aug 3, 2022, at 1:40 PM, Itiel Sadeh <[email protected] > > > >> wrote: > >>> > >>> Hello, > >>> > >>> First of all I just wanted to thank all of you for the work you are > doing > >>> on Calcite. > >>> > >>> We want to add support for the percentile_disc aggregate function, but > we > >>> encounter a problem. > >>> Our percentile_disc return type is dependent on the column of the > "order > >>> by" clause (just like postgresql > >>> <https://www.postgresql.org/docs/current/functions-aggregate.html>). > >>> However, I don't see how to achieve that on calcite. The issue is that > >> the > >>> sort column argument is not passed to the percentile_disc function, > >> Rather, > >>> it is stored outside of it as a collation. Therefore, I cannot use the > >>> SqlReturnTypeInference mechanism. > >>> > >>> If someone has an idea on how to achieve this it would be very much > >>> appreciated. > >>> Thank you, > >>> > >>> Itiel > >> > >> > >
