I’m not sure what Postgres has done, but it doesn’t make sense to derive the return type of PERCENTILE_DISC from the argument type, even though both are typically numeric types. (Strictly speaking, the argument to PERCENTILE_DISC can be any type that has a ‘-‘ operation, so it ought to work on DATE, TIMESTAMP, INTERVAL as well as numeric.)
To take an extreme example, if I am a geologist, the argument to PERCENTILE_DISC might be the age of my rock samples in millions of years, and the output will be a number between 0 and 1. I think your problem is that you don’t like the default data type returned by PERCENTILE_DISC. The solution would be to make the policy customizable via RelDataTypeSystem, as we did for other aggregate functions in https://issues.apache.org/jira/browse/CALCITE-1945 <https://issues.apache.org/jira/browse/CALCITE-1945>. Can you please log a Jira case for this, and we can discuss further there? Julian > On Aug 5, 2022, at 4:23 AM, Itiel Sadeh <[email protected]> wrote: > > Hi Julian, thanks for your email. > > I don't think those tickets will solve the problem as from what I > understand, they are referring to the input types, while my issue is with > the return type (although it might have the same root cause). > > As you can see in the code > <https://github.com/apache/calcite/blob/main/core/src/main/java/org/apache/calcite/sql/fun/SqlStdOperatorTable.java#L2304>, > right now the return type is hard coded as DOUBLE. The problem is that in > the following query: > >> select percentile_disc(0.12) within group(order by x) from t; > > The PERCENTILE_DISC argument will be only 0.12, and the argument "x" will > be outside of it (in the WITHIN_GROUP call). so the return type inference > cannot use the type of the column "x" when it's inferring the return type. > > Itiel > > On Thu, Aug 4, 2022 at 9:43 PM Julian Hyde <[email protected]> wrote: > >> I believer that PERCENTILE_DISC was implemented in >> https://issues.apache.org/jira/browse/CALCITE-4644 < >> https://issues.apache.org/jira/browse/CALCITE-4644>. There are open >> issues to support any sortable type [ >> https://issues.apache.org/jira/browse/CALCITE-4670 < >> https://issues.apache.org/jira/browse/CALCITE-4670> ] and also to change >> the implementation strategy [ >> https://issues.apache.org/jira/browse/CALCITE-4666 < >> https://issues.apache.org/jira/browse/CALCITE-4666> ]. >> >> Does your case map onto any of those? >> >> Is PERCENTILE_DISC able to deduce return types from its arguments today? >> If so, how does it do it? >> >> Julian >> >> >>> On Aug 3, 2022, at 1:40 PM, Itiel Sadeh <[email protected]> >> wrote: >>> >>> Hello, >>> >>> First of all I just wanted to thank all of you for the work you are doing >>> on Calcite. >>> >>> We want to add support for the percentile_disc aggregate function, but we >>> encounter a problem. >>> Our percentile_disc return type is dependent on the column of the "order >>> by" clause (just like postgresql >>> <https://www.postgresql.org/docs/current/functions-aggregate.html>). >>> However, I don't see how to achieve that on calcite. The issue is that >> the >>> sort column argument is not passed to the percentile_disc function, >> Rather, >>> it is stored outside of it as a collation. Therefore, I cannot use the >>> SqlReturnTypeInference mechanism. >>> >>> If someone has an idea on how to achieve this it would be very much >>> appreciated. >>> Thank you, >>> >>> Itiel >> >>
