[ https://issues.apache.org/jira/browse/FLINK-6473?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Flink Jira Bot updated FLINK-6473: ---------------------------------- Labels: auto-deprioritized-major (was: stale-major) > Add OVER window support for batch tables > ---------------------------------------- > > Key: FLINK-6473 > URL: https://issues.apache.org/jira/browse/FLINK-6473 > Project: Flink > Issue Type: New Feature > Components: Table SQL / API > Reporter: Fabian Hueske > Priority: Major > Labels: auto-deprioritized-major > > Add support for OVER windows for batch tables. > Since OVER windows are supported for streaming tables, this issue is not > about the API (which is available) but about adding the execution strategies > and translation for OVER windows on batch tables. > The feature could be implemented using the following plans > *UNBOUNDED OVER* > {code} > DataSet[Row] input = ... > DataSet[Row] result = input > .groupBy(partitionKeys) > .sortGroup(orderByKeys) > .reduceGroup(computeAggregates) > {code} > This implementation is quite straightforward because we don't need to retract > rows. > *BOUNDED OVER* > A bit more challenging are BOUNDED OVER windows, because we need to retract > values from aggregates and we don't want to store rows temporarily on the > heap. > {code} > DataSet[Row] input = ... > DataSet[Row] sorted = input > .partitionByHash(partitionKey) > .sortPartition(partitionKeys, orderByKeys) > DataSet[Row] result = sorted.coGroup(sorted) > .where(partitionKey).equalTo(partitionKey) > .with(computeAggregates) > {code} > With this, the data set should be partitioned and sorted once. The sorted > {{DataSet}} would be consumed twice (the optimizer should inject a temp > barrier on one of the inputs to avoid a consumption deadlock). The > {{CoGroupFunction}} would accumulate new rows into the aggregates from one > input and retract them from the other. Since both input streams are properly > sorted, this can happen in a zigzag fashion. We need verify that the > generated plan is was we want it to be. -- This message was sent by Atlassian Jira (v8.3.4#803005)