[ 
https://issues.apache.org/jira/browse/FLINK-38740?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

dylanhz updated FLINK-38740:
----------------------------
    Description: 
{code:sql}
select stddev_pop(x) from (values (cast(0.27 as double)), (cast(0.27 as 
double)), (cast(0.27 as double))) as T(x)
-- NaN

select var_pop(x) from (values (cast(0.27 as double)), (cast(0.27 as double)), 
(cast(0.27 as double))) as T(x)
-- -9.25185853854297E-18
{code}

Currently, Flink uses AggregateReduceFunctionsRule of Calcite to reduce 
STDDEV_POP/STDDEV_SAMP/VAR_POP/VAR_SAMP to some SUMs and COUNTs. For example:
{code:text}
STDDEV_POP(x) = SQRT( (SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) / COUNT(x))
{code}

This algorithm is known as a naive one-pass for variance calculation. However, 
it suffers from huge precision loss 

  was:
{code:sql}
select stddev_pop(x) from (values (cast(0.27 as double)), (cast(0.27 as 
double)), (cast(0.27 as double))) as T(x)
-- NaN
{code}


> STDDEV related functions may produce invalid results
> ----------------------------------------------------
>
>                 Key: FLINK-38740
>                 URL: https://issues.apache.org/jira/browse/FLINK-38740
>             Project: Flink
>          Issue Type: Bug
>          Components: Table SQL / API
>            Reporter: dylanhz
>            Priority: Major
>
> {code:sql}
> select stddev_pop(x) from (values (cast(0.27 as double)), (cast(0.27 as 
> double)), (cast(0.27 as double))) as T(x)
> -- NaN
> select var_pop(x) from (values (cast(0.27 as double)), (cast(0.27 as 
> double)), (cast(0.27 as double))) as T(x)
> -- -9.25185853854297E-18
> {code}
> Currently, Flink uses AggregateReduceFunctionsRule of Calcite to reduce 
> STDDEV_POP/STDDEV_SAMP/VAR_POP/VAR_SAMP to some SUMs and COUNTs. For example:
> {code:text}
> STDDEV_POP(x) = SQRT( (SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) / COUNT(x))
> {code}
> This algorithm is known as a naive one-pass for variance calculation. 
> However, it suffers from huge precision loss 



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

Reply via email to