GitHub user ggevay opened a pull request:
https://github.com/apache/flink/pull/856
[FLINK-2143] Added ReduceFunctionWithInverse
(The first commit is the same as the second commit in PR 684)
I didn't add an overload to reduce, but instead created
ReduceFunctionWithInverse as a descendant of ReduceFunction. It has an
invReduce method, which should be the inverse of reduce. getWindowBuffer checks
if `transformation.getUDF() instanceof ReduceFunctionWithInverse` and creates
an InversePreReducer in this case (except when the policy is tumbling or
jumping).
I made SumAggregator implement ReduceFunctionWithInverse, so calculating
sums of sliding or even more general windows should get a speedup from this.
WindowIntegrationTest tests this. (FLINK-2144 will also have a few
ReduceFunctionWithInverse implementors.)
Grouped case: This is just creating an instance of the pre-reducer for each
new group that appears in the window, which is the same as how the
MedianGroupedPreReducer works. Thus, I created a GenericGroupedPreReducer that
does this with any non-grouped pre-reducer, and refactored the median code to
also use this.
I also added a missing clean and removed a superfluous one.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/ggevay/flink invReduce
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/856.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #856
----
commit 3424a6294cdda8f8b364aa68747d7890a0a9dbab
Author: Gabor Gevay <[email protected]>
Date: 2015-05-24T10:43:30Z
[FLINK-2145] [streaming] Fast calculation of medians of windows
commit b44d2d3323249ce5488cc2389f8fd53d3613b448
Author: Gabor Gevay <[email protected]>
Date: 2015-06-15T15:35:47Z
[FLINK-2143] [streaming] Added ReduceFunctionWithInverse, which sometimes
speeds up reduceWindow
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---