[
https://issues.apache.org/jira/browse/FLINK-11454?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16774586#comment-16774586
]
Rong Rong commented on FLINK-11454:
-----------------------------------
Yes [~fhueske]. I think your summary is absolutely on point.
1. OVER window without retraction would require a full apply over the iterable
element list (with or w/o aggregation) instead of using the retract method
since it is not available.
2. Regarding flexibility, Yes, slice and merge can be tw independent operator
(the implementation can share the same windowOperator) and just chained
together)
3. To add to the multiple partial results and merge method, out of top of my
head: a decent implementation approach will be how Google's S2, or
mathematically the Hilbert Curve R-tree, handles it via a hierarchy tree
approach. But this can be an optimization we do later.
One more question I am thinking (also shared in the mailing list): I was
actually making some investigations on same window vs. traditional window on an
operator standpoint. My observation was that the only difference is (1) how the
window states are stored (and the corresponding add and remove); and in the
OVER-aggregate perspective (2) how they are triggered. Maybe there's a way to
abstract the common components, like how the window functions was applied, and
how the states are cleaned up as window operator base, and implement both of
them as an extension.
If we think these are all valid discussion, I am going to start a discussion
google doc since it might be easier to collaborate, what do you guys think?
> Support MergedStream operation
> ------------------------------
>
> Key: FLINK-11454
> URL: https://issues.apache.org/jira/browse/FLINK-11454
> Project: Flink
> Issue Type: Sub-task
> Components: DataStream API
> Reporter: Rong Rong
> Assignee: Rong Rong
> Priority: Major
>
> {{Following SlicedStream, the mergedStream operator merges results from
> sliced stream and produces windowing results.
> {code:java}
> val slicedStream: SlicedStream = inputStream
> .keyBy("key")
> .sliceWindow(Time.seconds(5L)) // new “slice window” concept: to
> combine
> // tumble results based on discrete
> // non-overlapping windows.
> .aggregate(aggFunc)
> val mergedStream1: MergedStream = slicedStream
> .slideOver(Time.second(10L)) // combine slice results with same
>
> // windowing function, equivalent to
> // WindowOperator with an aggregate
> state
> // and derived aggregate function.
> val mergedStream2: MergedStream = slicedStream
> .slideOver(Count.of(5))
> .apply(windowFunction) // apply a different window function
> over
> // the sliced results.{code}
> MergedStream are produced by MergeOperator:
> {{slideOver}} and {{apply}} can be combined into a {{OVER AGGREGATE}}
> implementation similar to the one in TableAPI.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)