[
https://issues.apache.org/jira/browse/CALCITE-6166?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17800251#comment-17800251
]
Caican Cai edited comment on CALCITE-6166 at 12/25/23 6:44 AM:
---------------------------------------------------------------
[~julianhyde] I think this is difficult to do. I have done tests on mysql, tidb
and other databases, and they seem to be unable to guarantee 100% conversion to
the correct type. Here I think there is an issue that needs to be considered.
If the double type is used, This may cause loss of accuracy when the value
exceeds the range of double.
was (Author: JIRAUSER302115):
[~julianhyde] I think this is difficult to do. I have done tests on mysql, tidb
and other databases. They seem to be unable to guarantee 100% conversion to the
correct type. We can try to retain a few decimals, such as double/double.
> Optimization rule AggregateReduceFunctionsRule rewrites queries to
> semantically non-equivalent ones
> ---------------------------------------------------------------------------------------------------
>
> Key: CALCITE-6166
> URL: https://issues.apache.org/jira/browse/CALCITE-6166
> Project: Calcite
> Issue Type: Bug
> Components: core
> Affects Versions: 1.36.0
> Reporter: Pinhan Zhao
> Priority: Blocker
>
> The AggregateReduceFunctionsRule rewrites AVG to SUM divided by COUNT and
> applies a cast. However, the type being casted to is erroneously determined
> - it always converts the SUM / COUNT expression to INTEGER, which is
> inconsistent with the default return type of AVG originally. Therefore, if
> the average value originally evaluates to a non-integer number, the
> expression after rewriting will lost the floating point precision as being
> converted to an integer.
>
> This issue is manifested in various test cases
> ([https://github.com/apache/calcite/blob/08f685683e8e6e97b47189d95603ccccaf17051d/core/src/test/java/org/apache/calcite/test/RelOptRulesTest.java,]
> and expected outputs
> [https://github.com/apache/calcite/blob/08f685683e8e6e97b47189d95603ccccaf17051d/core/src/test/resources/org/apache/calcite/test/RelOptRulesTest.xml]).
>
>
> For example, the rule testReduceAverageAndSumWithNoReduceStddevAndVar
> rewrites the original query
> {code:java}
> select name, stddev_pop(deptno), avg(deptno), var_pop(deptno) from dept group
> by name{code}
> to
> {code:java}
> select name, stddev_pop(deptno), cast(coalesce(sum(deptno), 0) / count(*) as
> integer), var_pop(deptno) from dept group by name{code}
>
> However, with a counterexample
> {code:java}
> CREATE TABLE dept (
> deptno INTEGER PRIMARY KEY,
> name TEXT NOT NULL
> );
> INSERT INTO dept VALUES (1, 'Sales');
> INSERT INTO dept VALUES (2, 'Sales');
> {code}
>
> The original query and the optimized query produce different outputs:
> {code:java}
> name | stddev_pop | avg | var_pop
> -------+------------------------+--------------------+------------------------
> Sales | 0.50000000000000000000 | 1.5000000000000000 | 0.25000000000000000000
> (1 row)
> name | stddev_pop | int4 | var_pop
> -------+------------------------+------+------------------------
> Sales | 0.50000000000000000000 | 1 | 0.25000000000000000000
> (1 row) {code}
> Specifically, the values of avg (1.5 vs 1) are different.
>
> In addition to the above case, the following test cases are also affected:
> * testReduceAverageAndVarWithNoReduceStddev
> * testCastInAggregateReduceFunctions
> * testReduceAverage
> * testReduceWithNonTypePredicate
> * testReduceAverageWithNoReduceSum
> * testReduceAllAggregateFunctions
--
This message was sent by Atlassian Jira
(v8.20.10#820010)