[
https://issues.apache.org/jira/browse/FLINK-37859?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Gustavo de Morais updated FLINK-37859:
--------------------------------------
Description:
We currently always rely on a chain of binary joins operators for multiple
non-temporal regular joins in a flink streaming job. This often generates a lot
of intermediate state which considerably increases the state size and leads to
decrease performance and performance degradation.
[FLIP-516|https://cwiki.apache.org/confluence/display/FLINK/FLIP-516%3A+Multi-Way+Join+Operator]
proposes and implements an operator that performs a join accross N inputs with
zero intermediate state and better performance for pipelines with records
amplification. This is a challenging task since real-world implementations of a
MultiJoin operator for a changelog stream processor [are not known and the
algorithm itself is
complicated.|https://issues.apache.org/jira/browse/SPARK-2215?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=14038513]
was:
We currently always rely on a chain of binary joins operators for multiple
non-temporal regular joins in a flink streaming job. This often generates a lot
of intermediate state which considerably increases the state size and leads to
decrease performance and performance degradation.
[FLIP-516|https://cwiki.apache.org/confluence/display/FLINK/FLIP-516%3A+Multi-Way+Join+Operator]
proposes and implements an operator that performs a join accross N inputs with
zero intermediate state and better performance for pipelines with records
amplification.
> FLIP-516: Streaming Multi-Way Join Optimization
> -----------------------------------------------
>
> Key: FLINK-37859
> URL: https://issues.apache.org/jira/browse/FLINK-37859
> Project: Flink
> Issue Type: New Feature
> Reporter: dalongliu
> Assignee: Gustavo de Morais
> Priority: Major
>
> We currently always rely on a chain of binary joins operators for multiple
> non-temporal regular joins in a flink streaming job. This often generates a
> lot of intermediate state which considerably increases the state size and
> leads to decrease performance and performance degradation.
>
> [FLIP-516|https://cwiki.apache.org/confluence/display/FLINK/FLIP-516%3A+Multi-Way+Join+Operator]
> proposes and implements an operator that performs a join accross N inputs
> with zero intermediate state and better performance for pipelines with
> records amplification. This is a challenging task since real-world
> implementations of a MultiJoin operator for a changelog stream processor [are
> not known and the algorithm itself is
> complicated.|https://issues.apache.org/jira/browse/SPARK-2215?page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel&focusedCommentId=14038513]
--
This message was sent by Atlassian Jira
(v8.20.10#820010)