[
https://issues.apache.org/jira/browse/FLINK-18830?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17183991#comment-17183991
]
Aljoscha Krettek commented on FLINK-18830:
------------------------------------------
The documentation describes {{DataStream.join()}} as an _inner join_:
https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/operators/joining.html.
The relevant excerpt is:
{quote}Some notes on semantics:
* The creation of pairwise combinations of elements of the two streams behaves
like an inner-join, meaning elements from one stream will not be emitted if
they don’t have a corresponding element from the other stream to be joined with.
{quote}
Is this about using {{JoinCoGroupFunction}} in another context, or about
changing the behaviour of the DataStream API?
> JoinCoGroupFunction and FlatJoinCoGroupFunction work incorrectly for outer
> join when one side of coGroup is empty
> -----------------------------------------------------------------------------------------------------------------
>
> Key: FLINK-18830
> URL: https://issues.apache.org/jira/browse/FLINK-18830
> Project: Flink
> Issue Type: Bug
> Components: API / DataStream
> Affects Versions: 1.11.1
> Reporter: liupengcheng
> Priority: Major
>
> Currently, {{JoinCoGroupFunction}} and {{FlatJoinCoGroupFunction}} in
> {{JoinedStreams}} do not respect the join type: both are implemented as a
> two-level nested loop. This is incorrect for an outer join when
> one side of the coGroup is empty.
> {code}
> public void coGroup(Iterable<T1> first, Iterable<T2> second,
>         Collector<T> out) throws Exception {
>     for (T1 val1 : first) {
>         for (T2 val2 : second) {
>             wrappedFunction.join(val1, val2, out);
>         }
>     }
> }
> {code}
> The code above is the current implementation. Suppose the first input is
> non-empty and the second input is an empty iterator: the join
> function ({{wrappedFunction}}) is then never called, so no data is
> emitted for a left outer join.
> I therefore propose taking the join type into account here and handling
> this case. For example, for a left outer join, we can emit a record with the
> right side set to null when the right side is empty or contains no match.
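
To make the difference concrete, here is a minimal, self-contained sketch of the two behaviours discussed above. It is not Flink code: the class {{LeftOuterCoGroupSketch}} and the methods {{innerCoGroup}} and {{leftOuterCoGroup}} are hypothetical names, and a plain {{java.util.function.BiFunction}} plus a returned list stand in for Flink's join function and {{Collector}}. It assumes, as in a coGroup, that both inputs already hold only elements with the same key, so "no match" reduces to "the right side is empty".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical illustration only; none of these names exist in Flink.
class LeftOuterCoGroupSketch {

    // Same shape as the nested loop quoted in the issue:
    // if 'second' is empty, the join function is never called.
    static <T1, T2, R> List<R> innerCoGroup(
            Iterable<T1> first, Iterable<T2> second, BiFunction<T1, T2, R> join) {
        List<R> out = new ArrayList<>();
        for (T1 val1 : first) {
            for (T2 val2 : second) {
                out.add(join.apply(val1, val2));
            }
        }
        return out;
    }

    // Left-outer variant along the lines of the proposal:
    // a left element with no right-side partner is joined with null.
    static <T1, T2, R> List<R> leftOuterCoGroup(
            Iterable<T1> first, Iterable<T2> second, BiFunction<T1, T2, R> join) {
        List<R> out = new ArrayList<>();
        for (T1 val1 : first) {
            boolean matched = false;
            for (T2 val2 : second) {
                matched = true;
                out.add(join.apply(val1, val2));
            }
            if (!matched) {
                // Assumption: the join function tolerates a null right side.
                out.add(join.apply(val1, null));
            }
        }
        return out;
    }
}
```

With a left side of {{"a", "b"}} and an empty right side, {{innerCoGroup}} returns an empty list, whereas {{leftOuterCoGroup}} calls the join function once per left element with a null right side. Whether user join functions can be expected to handle null is exactly the kind of behavioural change the comment above asks about.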