Filed: https://issues.apache.org/jira/browse/CALCITE-2030
> On Nov 1, 2017, at 1:20 PM, Julian Hyde <[email protected]> wrote: > > Sounds plausible. You know far more of the details than I do. > > Please log a JIRA case and let’s continue discussion there. > >> On Nov 1, 2017, at 9:41 AM, Marc Prud'hommeaux <[email protected]> wrote: >> >> >> Ahh, numeric with unspecified precision has the special meaning that it will >> retain whatever precision is stored in the column. From >> https://www.postgresql.org/docs/9.1/static/datatype-numeric.html#DATATYPE-NUMERIC-DECIMAL >> : >> >> "Specifying: NUMERIC without any precision or scale creates a column in >> which numeric values of any precision and scale can be stored, up to the >> implementation limit on precision. A column of this kind will not coerce >> input values to any particular scale, whereas numeric columns with a >> declared scale will coerce input values to that scale. (The SQL standard >> requires a default scale of 0, i.e., coercion to integer precision. We find >> this a bit useless. If you're concerned about portability, always specify >> the precision and scale explicitly.)" >> >> So even though the JDBC driver reports a precision of zero, it actually >> means arbitrary precision when it is on a numeric/decimal column. I'm >> guessing that extending SqlDialect.getCastSpec(RelDataType) in >> PostgresqlSqlDialect is the right place to fix this? >> >> -Marc >> >> >> >>> On Oct 31, 2017, at 11:38 AM, Julian Hyde <[email protected]> wrote: >>> >>> I don’t recall whether DECIMAL without precision + scale is even valid. >>> (Or, if Calcite treats it a “valid”, maybe Calcite is wrong, and should be >>> giving an error.) >>> >>>> On Oct 31, 2017, at 7:07 AM, Marc Prud'hommeaux <[email protected]> wrote: >>>> >>>> >>>> I understand, and I agree that the behavior is a sensible compromise. But >>>> in this case, price is a decimal column, and so the average call should >>>> also be a decimal, wheres it is being rounded to an integer (actually, a >>>> DECIMAL(19, 0) as you can see in the server log). >>>> >>>> However, it looks like this might only be an issue with PostgreSQL: >>>> >>>> 0: jdbc:calcite:schemaType=JDBC> select "price" from "products" limit 1; >>>> +---------------------+ >>>> | price | >>>> +---------------------+ >>>> | 25.99 | <-- price is a decimal... >>>> +---------------------+ >>>> 1 row selected (0.217 seconds) >>>> >>>> 0: jdbc:calcite:schemaType=JDBC> select avg("price") from "products"; >>>> +---------------------+ >>>> | EXPR$0 | >>>> +---------------------+ >>>> | 20 | <-- ... but the average isn't >>>> +---------------------+ >>>> >>>> 1 row selected (0.063 seconds) >>>> >>>> 0: jdbc:calcite:schemaType=JDBC> select avg(cast("price" as decimal)) from >>>> "products"; >>>> +---------------------+ >>>> | EXPR$0 | >>>> +---------------------+ >>>> | 20 | <-- casting to a non-precision decimal doesn't >>>> help... >>>> +---------------------+ >>>> 1 row selected (0.067 seconds) >>>> >>>> 0: jdbc:calcite:schemaType=JDBC> select avg(cast("price" as decimal(6,4))) >>>> from "products"; >>>> +--------+ >>>> | EXPR$0 | >>>> +--------+ >>>> | 20.0151| <-- but specifying the precision does >>>> +--------+ >>>> 1 row selected (0.066 seconds) >>>> >>>> >>>> >>>> Perhaps Calcite is missing the precision of the column when it reads the >>>> metadata, or the driver is misreporting the precision? Because executing >>>> directly against the PostgreSQL driver yields the correct behavior: >>>> >>>> 0: jdbc:postgresql://localhost/dvdstore> select avg(cast("price" as >>>> decimal)) from "products"; >>>> +-----------------------+ >>>> | 20.0151000000000000 | >>>> +-----------------------+ >>>> 1 row selected (0.022 seconds) >>>> >>>> >>>> I’ll dig a bit further and submit a PR if I can find a fix. >>>> >>>> Thanks for your help! >>>> >>>> -Marc >>>> >>>> >>>> >>>>> On Oct 30, 2017, at 11:16 AM, Julian Hyde <[email protected]> wrote: >>>>> >>>>> The bug explains the current behavior. The behavior is not what everyone >>>>> would like, but it is what it is. I happen to like it because it is >>>>> simple. The behavior is this: if have a column c of type T, then AVG(c) >>>>> will have type T. If c is an INTEGER, then AVG will return an INTEGER. If >>>>> you cast that result to DOUBLE, surprise surprise, that DOUBLE has no >>>>> fractional part. >>>>> >>>>> The solution is to convert the column before applying AVG: AVG(CAST(c AS >>>>> DOUBLE)) will return DOUBLE. >>>>> >>>>> Julian >>>>> >>>>> >>>>>> On Oct 30, 2017, at 5:54 AM, Marc Prud'hommeaux <[email protected]> wrote: >>>>>> >>>>>> >>>>>> I had noticed that issue, but it purports to be closed for 1.14.0, which >>>>>> I am using. >>>>>> >>>>>> It only seems to affect AVG; other aggregates don’t appear to be >>>>>> rounded. E.g.: >>>>>> >>>>>> 0: jdbc:calcite:schemaType=JDBC> select min("price"), sum("price"), >>>>>> avg("price") from "products"; >>>>>> +---------------------+---------------------+---------------------+ >>>>>> | EXPR$0 | EXPR$1 | EXPR$2 | >>>>>> +---------------------+---------------------+---------------------+ >>>>>> | 9.99 | 200151.00 | 20 | >>>>>> +---------------------+---------------------+---------------------+ >>>>>> >>>>>> yields the server-side log: >>>>>> >>>>>> 2017-10-30 08:16:00 EDT [4272-13] dgdemo@dvdstore LOG: execute >>>>>> <unnamed>: SELECT MIN("price"), CASE WHEN COUNT(*) = 0 THEN NULL ELSE >>>>>> SUM("price") END, CAST(CASE WHEN COUNT(*) = 0 THEN NULL ELSE >>>>>> SUM("price") END / COUNT(*) AS DECIMAL(19, 0)) FROM “products" >>>>>> >>>>>> Is this something I can get around by implementing my own >>>>>> RelDataTypeSystem? If so, I’ll experiment with that. >>>>>> >>>>>> -Marc >>>>>> >>>>>> >>>>>> >>>>>>> On Oct 29, 2017, at 7:11 PM, Julian Hyde <[email protected]> wrote: >>>>>>> >>>>>>> See https://issues.apache.org/jira/browse/CALCITE-1945 >>>>>>> <https://issues.apache.org/jira/browse/CALCITE-1945>. >>>>>>> >>>>>>>> On Oct 29, 2017, at 3:31 PM, Marc Prud'hommeaux <[email protected]> >>>>>>>> wrote: >>>>>>>> >>>>>>>> >>>>>>>> When I run the following against a Calcite connection containing the >>>>>>>> PostgreSQL "dvdstore" sample database: >>>>>>>> >>>>>>>> select avg(products.price) from dvdstore.products group by >>>>>>>> products.category >>>>>>>> >>>>>>>> The following SQL is executed on the server: >>>>>>>> >>>>>>>> SELECT CAST(SUM("price") / COUNT(*) AS DECIMAL(19, 0)) FROM "products" >>>>>>>> GROUP BY “category" >>>>>>>> >>>>>>>> Is there some way I can prevent Calcite from rounding it (price is a >>>>>>>> decimal type)? Is there some reason it isn’t just sending the >>>>>>>> aggregate as an AVG? >>>>>>>> >>>>>>>> -Marc >>>>>>>> >>>>>>> >>>>>> >>>>> >>>> >>> >> >
