lincoln lee created FLINK-38400:
-----------------------------------

             Summary: STDDEV/VAR function with FILTER condition may cause wrong 
result
                 Key: FLINK-38400
                 URL: https://issues.apache.org/jira/browse/FLINK-38400
             Project: Flink
          Issue Type: Bug
    Affects Versions: 2.1.0, 1.20.2
            Reporter: lincoln lee
            Assignee: lincoln lee
             Fix For: 2.2.0, 2.1.1, 1.20.4


As reported in https://issues.apache.org/jira/browse/CALCITE-7192, the 
following query may produce a wrong result:

{code}
SELECT STDDEV_POP(salary) FILTER (WHERE salary > 1000) FROM employees;
{code}

The decomposition for STDDEV_POP will be:

{code}
SQRT((SUM(x * x) - SUM(x) * SUM(x) / COUNT(x)) / COUNT(x))
{code}

For the above example, `SUM(salary * salary)` loses the filter `salary > 
1000`, so that term is computed over all rows instead of only the filtered 
ones.
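
For illustration, here is a hand-written sketch of what the expansion would look like with the filter applied to every aggregate. It is not the plan that Calcite or Flink actually generates. It follows the simplified formula above and assumes `salary` is a DOUBLE column, so no casts are shown:

{code}
-- Sketch only: every aggregate in the expansion carries the original FILTER.
-- Assumes salary is DOUBLE; casts and integer-division handling are omitted.
SELECT SQRT(
         (SUM(salary * salary) FILTER (WHERE salary > 1000)
          - SUM(salary) FILTER (WHERE salary > 1000)
            * SUM(salary) FILTER (WHERE salary > 1000)
            / COUNT(salary) FILTER (WHERE salary > 1000))
         / COUNT(salary) FILTER (WHERE salary > 1000))
FROM employees;
{code}

If the first `SUM` is evaluated without its `FILTER`, the sum of squares covers all rows while the remaining terms cover only the filtered rows, and the two no longer describe the same data set.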
 
*Impact*
This affects all functions that use the `reduceStddev` method:
- **STDDEV_POP(x) FILTER (WHERE condition)**
- **STDDEV_SAMP(x) FILTER (WHERE condition)**
- **VAR_POP(x) FILTER (WHERE condition)**
- **VAR_SAMP(x) FILTER (WHERE condition)**

We should port the fix before upgrading to Calcite 1.41.0.

 



