[
https://issues.apache.org/jira/browse/FLINK-6243?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Flink Jira Bot updated FLINK-6243:
----------------------------------
Labels: auto-deprioritized-major auto-deprioritized-minor (was:
auto-deprioritized-major stale-minor)
Priority: Not a Priority (was: Minor)
This issue was labeled "stale-minor" 7 days ago and has not received any
updates so it is being deprioritized. If this ticket is actually Minor, please
raise the priority and ask a committer to assign you the issue or revive the
public discussion.
> Continuous Joins: True Sliding Window Joins
> --------------------------------------------
>
> Key: FLINK-6243
> URL: https://issues.apache.org/jira/browse/FLINK-6243
> Project: Flink
> Issue Type: New Feature
> Components: API / DataStream
> Affects Versions: 1.1.4
> Reporter: Elias Levy
> Priority: Not a Priority
> Labels: auto-deprioritized-major, auto-deprioritized-minor
>
> Flink defines sliding window joins as the join of elements of two streams
> that share a window of time, where the windows are defined by advancing them
> forward some amount of time that is less than the window time span. More
> generally, such windows are just overlapping hopping windows.
> Other systems, such as Kafka Streams, support a different notion of sliding
> window joins. In these systems, two elements of a stream are joined if the
> absolute time difference between the them is less or equal the time window
> length.
> This alternate notion of sliding window joins has some advantages in some
> applications over the current implementation.
> Elements to be joined may both fall within multiple overlapping sliding
> windows, leading them to be joined multiple times, when we only wish them to
> be joined once.
> The implementation need not instantiate window objects to keep track of
> stream elements, which becomes problematic in the current implementation if
> the window size is very large and the slide is very small.
> It allows for asymmetric time joins. E.g. join if elements from stream A are
> no more than X time behind and Y time head of an element from stream B.
> It is currently possible to implement a join with these semantics using
> {{CoProcessFunction}}, but the capability should be a first class feature,
> such as it is in Kafka Streams.
> To perform the join, elements of each stream must be buffered for at least
> the window time length. To allow for large window sizes and high volume of
> elements, the state, possibly optionally, should be buffered such as it can
> spill to disk (e.g. by using RocksDB).
> The same stream may be joined multiple times in a complex topology. As an
> optimization, it may be wise to reuse any element buffer among colocated join
> operators. Otherwise, there may write amplification and increased state that
> must be snapshotted.
--
This message was sent by Atlassian Jira
(v8.20.1#820001)