[
https://issues.apache.org/jira/browse/FLINK-5768?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15888235#comment-15888235
]
ASF GitHub Bot commented on FLINK-5768:
---------------------------------------
Github user fhueske commented on a diff in the pull request:
https://github.com/apache/flink/pull/3423#discussion_r103474797
--- Diff:
flink-libraries/flink-table/src/main/scala/org/apache/flink/table/runtime/aggregate/DataSetSessionWindowAggregateCombineGroupFunction.scala
---
@@ -79,44 +85,60 @@ class DataSetSessionWindowAggregateCombineGroupFunction(
// calculate the current window and open a new window.
if (windowEnd != null) {
// emit the current window's merged data
- doCollect(out, windowStart, windowEnd)
+ doCollect(out, accumulatorList, windowStart, windowEnd)
+
+ // clear the accumulator list for all aggregate
+ for (i <- aggregates.indices) {
+ accumulatorList(i).clear()
+ }
} else {
// set group keys to aggregateBuffer.
for (i <- groupingKeys.indices) {
aggregateBuffer.setField(i, record.getField(i))
}
}
- // initiate intermediate aggregate value.
- aggregates.foreach(_.initiate(aggregateBuffer))
windowStart = record.getField(rowTimeFieldPos).asInstanceOf[Long]
}
- // merge intermediate aggregate value to the buffered value.
- aggregates.foreach(_.merge(record, aggregateBuffer))
+ // collect the accumulators for each aggregate
+ for (i <- aggregates.indices) {
--- End diff --
We cannot collect all accumulator and need to merge pairwise.
I think it would be good to remove the preparation mapper and use
`accumulate()` here but, this would result in even more significant code
changes.
> Apply new aggregation functions for datastream and dataset tables
> -----------------------------------------------------------------
>
> Key: FLINK-5768
> URL: https://issues.apache.org/jira/browse/FLINK-5768
> Project: Flink
> Issue Type: Sub-task
> Components: Table API & SQL
> Reporter: Shaoxuan Wang
> Assignee: Shaoxuan Wang
>
> Apply new aggregation functions for datastream and dataset tables
> This includes:
> 1. Change the implementation of the DataStream aggregation runtime code to
> use new aggregation functions and aggregate dataStream API.
> 2. DataStream will be always running in incremental mode, as explained in
> 06/Feb/2017 in FLINK5564.
> 2. Change the implementation of the Dataset aggregation runtime code to use
> new aggregation functions.
> 3. Clean up unused class and method.
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)