Filed: https://issues.apache.org/jira/browse/CALCITE-2030


> On Nov 1, 2017, at 1:20 PM, Julian Hyde <[email protected]> wrote:
> 
> Sounds plausible. You know far more of the details than I do.
> 
> Please log a JIRA case and let’s continue discussion there.
> 
>> On Nov 1, 2017, at 9:41 AM, Marc Prud'hommeaux <[email protected]> wrote:
>> 
>> 
>> Ahh, numeric with unspecified precision has the special meaning that it will 
>> retain whatever precision is stored in the column. From 
>> https://www.postgresql.org/docs/9.1/static/datatype-numeric.html#DATATYPE-NUMERIC-DECIMAL
>>  :
>> 
>> "Specifying: NUMERIC without any precision or scale creates a column in 
>> which numeric values of any precision and scale can be stored, up to the 
>> implementation limit on precision. A column of this kind will not coerce 
>> input values to any particular scale, whereas numeric columns with a 
>> declared scale will coerce input values to that scale. (The SQL standard 
>> requires a default scale of 0, i.e., coercion to integer precision. We find 
>> this a bit useless. If you're concerned about portability, always specify 
>> the precision and scale explicitly.)"
>> 
>> So even though the JDBC driver reports a precision of zero, it actually 
>> means arbitrary precision when it is on a numeric/decimal column. I'm 
>> guessing that extending SqlDialect.getCastSpec(RelDataType) in 
>> PostgresqlSqlDialect is the right place to fix this?
>> 
>>      -Marc
>> 
>> 
>> 
>>> On Oct 31, 2017, at 11:38 AM, Julian Hyde <[email protected]> wrote:
>>> 
>>> I don’t recall whether DECIMAL without precision + scale is even valid. 
>>> (Or, if Calcite treats it a “valid”, maybe Calcite is wrong, and should be 
>>> giving an error.)
>>> 
>>>> On Oct 31, 2017, at 7:07 AM, Marc Prud'hommeaux <[email protected]> wrote:
>>>> 
>>>> 
>>>> I understand, and I agree that the behavior is a sensible compromise. But 
>>>> in this case, price is a decimal column, and so the average call should 
>>>> also be a decimal, wheres it is being rounded to an integer (actually, a 
>>>> DECIMAL(19, 0) as you can see in the server log).
>>>> 
>>>> However, it looks like this might only be an issue with PostgreSQL:
>>>> 
>>>> 0: jdbc:calcite:schemaType=JDBC> select "price" from "products" limit 1;
>>>> +---------------------+
>>>> |        price        |
>>>> +---------------------+
>>>> | 25.99               | <-- price is a decimal...
>>>> +---------------------+
>>>> 1 row selected (0.217 seconds)
>>>> 
>>>> 0: jdbc:calcite:schemaType=JDBC> select avg("price") from "products";
>>>> +---------------------+
>>>> |       EXPR$0        |
>>>> +---------------------+
>>>> | 20                  | <-- ... but the average isn't
>>>> +---------------------+
>>>> 
>>>> 1 row selected (0.063 seconds)
>>>> 
>>>> 0: jdbc:calcite:schemaType=JDBC> select avg(cast("price" as decimal)) from 
>>>> "products";
>>>> +---------------------+
>>>> |       EXPR$0        |
>>>> +---------------------+
>>>> | 20                  | <-- casting to a non-precision decimal doesn't 
>>>> help...
>>>> +---------------------+
>>>> 1 row selected (0.067 seconds)
>>>> 
>>>> 0: jdbc:calcite:schemaType=JDBC> select avg(cast("price" as decimal(6,4))) 
>>>> from "products";
>>>> +--------+
>>>> | EXPR$0 |
>>>> +--------+
>>>> | 20.0151| <-- but specifying the precision does
>>>> +--------+
>>>> 1 row selected (0.066 seconds)
>>>> 
>>>> 
>>>> 
>>>> Perhaps Calcite is missing the precision of the column when it reads the 
>>>> metadata, or the driver is misreporting the precision? Because executing 
>>>> directly against the PostgreSQL driver yields the correct behavior:
>>>> 
>>>> 0: jdbc:postgresql://localhost/dvdstore> select avg(cast("price" as 
>>>> decimal)) from "products";
>>>> +-----------------------+
>>>> | 20.0151000000000000   |
>>>> +-----------------------+
>>>> 1 row selected (0.022 seconds)
>>>> 
>>>> 
>>>> I’ll dig a bit further and submit a PR if I can find a fix.
>>>> 
>>>> Thanks for your help!
>>>> 
>>>>    -Marc
>>>> 
>>>> 
>>>> 
>>>>> On Oct 30, 2017, at 11:16 AM, Julian Hyde <[email protected]> wrote:
>>>>> 
>>>>> The bug explains the current behavior. The behavior is not what everyone 
>>>>> would like, but it is what it is. I happen to like it because it is 
>>>>> simple. The behavior is this: if have a column c of type T, then AVG(c) 
>>>>> will have type T. If c is an INTEGER, then AVG will return an INTEGER. If 
>>>>> you cast that result to DOUBLE, surprise surprise, that DOUBLE has no 
>>>>> fractional part.
>>>>> 
>>>>> The solution is to convert the column before applying AVG: AVG(CAST(c AS 
>>>>> DOUBLE)) will return DOUBLE.
>>>>> 
>>>>> Julian
>>>>> 
>>>>> 
>>>>>> On Oct 30, 2017, at 5:54 AM, Marc Prud'hommeaux <[email protected]> wrote:
>>>>>> 
>>>>>> 
>>>>>> I had noticed that issue, but it purports to be closed for 1.14.0, which 
>>>>>> I am using.
>>>>>> 
>>>>>> It only seems to affect AVG; other aggregates don’t appear to be 
>>>>>> rounded. E.g.:
>>>>>> 
>>>>>> 0: jdbc:calcite:schemaType=JDBC> select min("price"), sum("price"), 
>>>>>> avg("price") from "products";
>>>>>> +---------------------+---------------------+---------------------+
>>>>>> |       EXPR$0        |       EXPR$1        |       EXPR$2        |
>>>>>> +---------------------+---------------------+---------------------+
>>>>>> | 9.99                | 200151.00           | 20                  |
>>>>>> +---------------------+---------------------+---------------------+
>>>>>> 
>>>>>> yields the server-side log:
>>>>>> 
>>>>>> 2017-10-30 08:16:00 EDT [4272-13] dgdemo@dvdstore LOG:  execute 
>>>>>> <unnamed>: SELECT MIN("price"), CASE WHEN COUNT(*) = 0 THEN NULL ELSE 
>>>>>> SUM("price") END, CAST(CASE WHEN COUNT(*) = 0 THEN NULL ELSE 
>>>>>> SUM("price") END / COUNT(*) AS DECIMAL(19, 0)) FROM “products"
>>>>>> 
>>>>>> Is this something I can get around by implementing my own 
>>>>>> RelDataTypeSystem? If so, I’ll experiment with that.
>>>>>> 
>>>>>>  -Marc
>>>>>> 
>>>>>> 
>>>>>> 
>>>>>>> On Oct 29, 2017, at 7:11 PM, Julian Hyde <[email protected]> wrote:
>>>>>>> 
>>>>>>> See https://issues.apache.org/jira/browse/CALCITE-1945 
>>>>>>> <https://issues.apache.org/jira/browse/CALCITE-1945>.
>>>>>>> 
>>>>>>>> On Oct 29, 2017, at 3:31 PM, Marc Prud'hommeaux <[email protected]> 
>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>> 
>>>>>>>> When I run the following against a Calcite connection containing the 
>>>>>>>> PostgreSQL "dvdstore" sample database:
>>>>>>>> 
>>>>>>>> select avg(products.price) from dvdstore.products group by 
>>>>>>>> products.category
>>>>>>>> 
>>>>>>>> The following SQL is executed on the server:
>>>>>>>> 
>>>>>>>> SELECT CAST(SUM("price") / COUNT(*) AS DECIMAL(19, 0)) FROM "products" 
>>>>>>>> GROUP BY “category"
>>>>>>>> 
>>>>>>>> Is there some way I can prevent Calcite from rounding it (price is a 
>>>>>>>> decimal type)? Is there some reason it isn’t just sending the 
>>>>>>>> aggregate as an AVG?
>>>>>>>> 
>>>>>>>>        -Marc
>>>>>>>> 
>>>>>>> 
>>>>>> 
>>>>> 
>>>> 
>>> 
>> 
> 

Reply via email to