[ 
https://issues.apache.org/jira/browse/CALCITE-6080?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17876910#comment-17876910
 ] 

Mihai Budiu commented on CALCITE-6080:
--------------------------------------

This is due to rounding errors in FP computations. When all input values are 
about the same the partial should be 0, but it's sometimes negative, and the 
sqrt complains. I think it's always safe to take sqrt(max(partial_result, 0)).

> The simplified form after applying AggregateReduceFunctionsRule is giving 
> wrong results for STDDEV, Covariance with double and decimal types.
> ---------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: CALCITE-6080
>                 URL: https://issues.apache.org/jira/browse/CALCITE-6080
>             Project: Calcite
>          Issue Type: Bug
>            Reporter: Dayakar M
>            Priority: Major
>
> The simplified form after applying AggregateReduceFunctionsRule is giving 
> wrong results for STDDEV, Covariance with double and decimal types.
> For example, after applying AggregateReduceFunctionsRule 
> {noformat}
> STDDEV_POP(x) -> SQRT((SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) / COUNT(x))
> {noformat}
> for x as double/decimal, it is giving wrong result which can be easily 
> reproducible with below simple java code
>  
> {code:java}
>     double input1 = 23.79d;
>     double o1 = input1 * input1;
>     System.out.println("ip*ip=" + o1);
>     double sum = o1 + o1 + o1;
>     System.out.println("Sum(ip*ip)="+sum);    double sum1 = input1 + input1 + 
> input1;
>     System.out.println("Sum(ip)="+sum1);
>     double sum2 = sum1 * sum1;
>     System.out.println("Sum(ip)*Sum(ip)="+ sum2);
>     double fin = sum2/3d;
>     System.out.println("Sum(ip)*Sum(ip)/3="+fin);
>     double fin1 = sum - fin;
>     System.out.println("Sum(ip*ip)-Sum(ip)*Sum(ip)/3=" + fin1); 
>     System.out.println("SQRT((Sum(ip*ip)-Sum(ip)*Sum(ip)/3)/3)=" + 
> Math.sqrt(fin1/3));{code}
> The output is
> {code:java}
> ip*ip=565.9640999999999
> Sum(ip*ip)=1697.8922999999998
> Sum(ip)=71.37
> Sum(ip)*Sum(ip)=5093.6769
> Sum(ip)*Sum(ip)/3=1697.8923000000002
> Sum(ip*ip)-Sum(ip)*Sum(ip)/3=-4.547473508864641E-13
> SQRT((Sum(ip*ip)-Sum(ip)*Sum(ip)/3)/3)=NaN {code}
> The final output should be *0.0* but here it is coming as {*}NaN{*}.
> So for double and decimal type data we should not simplify STDDEV, Covariance 
> functions as it leads to wrong results.
>  
>  



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to