[ 
https://issues.apache.org/jira/browse/KAFKA-12909?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17522502#comment-17522502
 ] 

Matthias J. Sax edited comment on KAFKA-12909 at 4/21/22 9:13 PM:
------------------------------------------------------------------

{quote}It makes sense but it is still kind of hard to wrap my head around it.
{quote}
Assume you have the following input (and a join window of 10):

left: <k,v1,95>

right: <k,v2,100> <k,v3,105>

result: <k,<v1,v2>,100> <k, <v1,v3>, 105>

The second result record is not a replacement of the first result record. The 
result is really two records.

Doing the same with left-join (and eager emit) you get: result: 
<k,<v1,null>,95>, <k,<v1,v2>,100> <k, <v1,v3>, 105> – for this case, the first 
"left-join" is clearly incorrect, right? But how can you know that the second 
record is an update to the first one, while the third record is _no_ update to 
the second one?
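
To make the example concrete, here is a minimal sketch in plain Java that replays the input above. It is not the Kafka Streams implementation: the names `JoinSketch`, `Event`, `join` and `sampleInput` are hypothetical, and it assumes records are processed in timestamp order, that two records join when their timestamps differ by at most the window size, and that a result carries the larger of the two timestamps.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Illustrative only: a toy in-memory stream-stream join, not Kafka Streams code.
final class JoinSketch {

    /** One input record; `left` marks which input stream it arrived on. */
    static final class Event {
        final boolean left;
        final String key;
        final String value;
        final long ts;

        Event(boolean left, String key, String value, long ts) {
            this.left = left;
            this.key = key;
            this.value = value;
            this.ts = ts;
        }
    }

    /** The input from the example: left <k,v1,95>; right <k,v2,100> <k,v3,105>. */
    static List<Event> sampleInput() {
        return List.of(
                new Event(true, "k", "v1", 95),
                new Event(false, "k", "v2", 100),
                new Event(false, "k", "v3", 105));
    }

    /**
     * Joins records in timestamp order. With eagerLeftJoin set, a left record
     * that finds no partner on arrival immediately emits <left,null>.
     */
    static List<String> join(List<Event> input, long window, boolean eagerLeftJoin) {
        List<Event> ordered = new ArrayList<>(input);
        ordered.sort(Comparator.comparingLong((Event e) -> e.ts));
        List<Event> seen = new ArrayList<>();
        List<String> out = new ArrayList<>();
        for (Event e : ordered) {
            boolean matched = false;
            for (Event other : seen) {
                boolean joins = other.left != e.left
                        && other.key.equals(e.key)
                        && Math.abs(other.ts - e.ts) <= window;
                if (joins) {
                    Event l = e.left ? e : other;
                    Event r = e.left ? other : e;
                    out.add(format(e.key, l.value, r.value, Math.max(l.ts, r.ts)));
                    matched = true;
                }
            }
            if (e.left && !matched && eagerLeftJoin) {
                out.add(format(e.key, e.value, null, e.ts));
            }
            seen.add(e);
        }
        return out;
    }

    static String format(String key, String leftValue, String rightValue, long ts) {
        return "<" + key + ",<" + leftValue + "," + rightValue + ">," + ts + ">";
    }

    public static void main(String[] args) {
        System.out.println("inner:      " + join(sampleInput(), 10, false));
        System.out.println("eager left: " + join(sampleInput(), 10, true));
    }
}
```

Nothing in the eager output marks `<k,<v1,v2>,100>` as a correction of `<k,<v1,null>,95>`. All three records look like independent results, so a downstream consumer cannot tell them apart.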

Maybe also check out: 
[https://www.confluent.io/events/kafka-summit-europe-2021/temporal-joins-in-kafka-streams-and-ksqldb/]


was (Author: mjsax):
{quote}It makes sense but it is still kind of hard to wrap my head around it.
{quote}
Assume you have the following input (and a join window of 10):

left: <k,v1,100>

right: <k,v2,95> <k,v3,105>

result: <k,<v1,v2>,100> <k, <v1,v3>, 105>

The second result record is not a replacement of the first result record. The 
result is really two records.

Doing the same with left-join (and eager emit) you get: result: 
<k,<v1,null>,95>, <k,<v1,v2>,100> <k, <v1,v3>, 105> – for this case, the first 
"left-join" is clearly incorrect, right? But how can you know that the second 
record is an update to the first one, while the third record is _no_ update to 
the second one?

Maybe also check out: 
https://www.confluent.io/events/kafka-summit-europe-2021/temporal-joins-in-kafka-streams-and-ksqldb/

> Allow users to opt-into spurious left/outer stream-stream join improvement
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-12909
>                 URL: https://issues.apache.org/jira/browse/KAFKA-12909
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: Matthias J. Sax
>            Assignee: Matthias J. Sax
>            Priority: Blocker
>             Fix For: 3.1.0
>
>
> https://issues.apache.org/jira/browse/KAFKA-10847 improves left/outer 
> stream-stream join by not emitting left/outer results eagerly, but only 
> after the grace period has passed.
> While this change is desired, there is an issue with regard to upgrades: if 
> users don't specify a grace period, we fall back to a 24h default. Thus, 
> left/outer join results would only be emitted 24h after the join window end. 
> This change in behavior could break existing applications when upgrading to 
> the 3.0.0 release. – And even if users do set a grace period explicitly, it's 
> still unclear if the new delayed output behavior would work for them.
> Thus, we propose to disable the fix of KAFKA-10847 by default, and let users 
> opt into the fix explicitly instead.
> To allow users to enable the fix, we want to piggy-back on KIP-633 
> (https://issues.apache.org/jira/browse/KAFKA-8613) that deprecated the 
> existing `JoinWindows.of()` and `JoinWindows#grace()` methods in favor of 
> `JoinWindows.ofSizeAndGrace()` – if users don't update their code, we would 
> keep the fix disabled, and thus, if users upgrade their app nothing changes. 
> Only if users switch to the new `ofSizeAndGrace()` API, we enable the fix and 
> thus give users the opportunity to opt in explicitly and pick an appropriate 
> grace period for their application.



--
This message was sent by Atlassian Jira
(v8.20.7#820007)
