GitHub user fhueske opened a pull request:
https://github.com/apache/flink/pull/5706
[FLINK-8903] [table] Fix VAR_SAMP, VAR_POP, STDDEV_SAMP, STDDEV_POP functions
on GROUP BY windows.
## What is the purpose of the change
* Fixes the computation of `VAR_SAMP`, `VAR_POP`, `STDDEV_SAMP`,
`STDDEV_POP` aggregations in the context of `GROUP BY` windows (`TUMBLE`,
`HOP`, `SESSION`). Currently, these functions are incorrectly computed as `AVG`.
## Brief change log
* copy Calcite's `AggregateReduceFunctionsRule` to Flink and improve its
extensibility
* add a `WindowAggregateReduceFunctionsRule` based on the copied
`AggregateReduceFunctionsRule` to decompose the faulty aggregation functions
into `COUNT` and `SUM` functions.
* add restriction to `FlinkLogicalWindowAggregateConverter` to prevent
translation of group window aggregates with failing aggregation functions
* prevent translation of `VAR_SAMP`, `VAR_POP`, `STDDEV_SAMP`, `STDDEV_POP`
in `AggregateUtil`
* add unit tests (plan validation) for batch (SQL, Table API) and stream
(SQL, Table API)
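
For illustration, the decomposition into `COUNT` and `SUM` mentioned above can be sketched with plain arithmetic. This is a minimal Python sketch, not the Flink or Calcite code: the helper names (`reduce_to_count_and_sums`, `var_pop`, and so on) are hypothetical, and the formulas are assumed to follow the usual reduction `VAR_POP(x) = (SUM(x*x) - SUM(x)*SUM(x)/COUNT(x)) / COUNT(x)`, with the sample variants dividing by `COUNT(x) - 1` and the standard deviations taking the square root.

```python
import math


def reduce_to_count_and_sums(values):
    """Compute the three simple aggregates the decomposition relies on:
    COUNT(x), SUM(x) and SUM(x * x)."""
    count = len(values)
    total = sum(values)
    total_sq = sum(v * v for v in values)
    return count, total, total_sq


def var_pop(count, total, total_sq):
    # Population variance expressed only through COUNT and SUM results.
    return (total_sq - total * total / count) / count


def var_samp(count, total, total_sq):
    # Sample variance; undefined (None, i.e. SQL NULL) for a single row.
    if count <= 1:
        return None
    return (total_sq - total * total / count) / (count - 1)


def stddev_pop(count, total, total_sq):
    return math.sqrt(var_pop(count, total, total_sq))


def stddev_samp(count, total, total_sq):
    v = var_samp(count, total, total_sq)
    return None if v is None else math.sqrt(v)


# Hypothetical values falling into one window.
window = [2, 4, 4, 4, 5, 5, 7, 9]
aggs = reduce_to_count_and_sums(window)
print("AVG        =", aggs[1] / aggs[0])
print("VAR_POP    =", var_pop(*aggs))
print("STDDEV_POP =", stddev_pop(*aggs))
print("VAR_SAMP   =", var_samp(*aggs))
```

For this sample window, the average and the population variance differ, which is the kind of discrepancy that goes unnoticed when the variance functions are silently evaluated as `AVG`.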
## Verifying this change
* run the added plan tests
## Does this pull request potentially affect one of the following parts:
- Dependencies (does it add or upgrade a dependency): **no**
- The public API, i.e., is any changed class annotated with
`@Public(Evolving)`: **no**
- The serializers: **no**
- The runtime per-record code paths (performance sensitive): **no**
- Anything that affects deployment or recovery: JobManager (and its
components), Checkpointing, Yarn/Mesos, ZooKeeper: **no**
- The S3 file system connector: **no**
## Documentation
- Does this pull request introduce a new feature? **no**
- If yes, how is the feature documented? **n/a**
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/fhueske/flink tableVarStddevAggFix
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/5706.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #5706
----
commit 517567348b0ec0c23ef0c1dcc05c54a91d5c5671
Author: Fabian Hueske <fhueske@...>
Date: 2018-03-15T20:04:00Z
[FLINK-8903] [table] Fix VAR_SAMP, VAR_POP, STDEV_SAMP, STDEV_POP functions
on GROUP BY windows.
----