VictorvandenHoven commented on code in PR #15510:
URL: https://github.com/apache/kafka/pull/15510#discussion_r1522948296


##########
streams/src/main/java/org/apache/kafka/streams/kstream/internals/KStreamKStreamJoin.java:
##########
@@ -223,9 +223,9 @@ private void emitNonJoinedOuterRecords(
             try (final KeyValueIterator<TimestampedKeyAndJoinSide<K>, 
LeftOrRightValue<V1, V2>> it = store.all()) {
                 TimestampedKeyAndJoinSide<K> prevKey = null;
 
+                boolean outerJoinLeftBreak = false;
+                boolean outerJoinRightBreak = false;
                 while (it.hasNext()) {
-                    boolean outerJoinLeftBreak = false;
-                    boolean outerJoinRightBreak = false;

Review Comment:
   Ok, that's good.
   
   But we better not break the loop at the first timestamp which is too large.
   For instance when the first timestamp which is too large belongs to a left 
record, there still can be a timestamp for a right record that is not yet too 
large.  This is because left and right records can have different (asymmetric, 
0-sized) window sizes.
   
   So, I would only break the loop when we have found the first timestamp that 
is too large for one-side record and then the second timestamp that is too 
large for the other-side record.
   
   Unfortunately, this means that in the case of a left-join-call, there will 
only be found right-records in the outer join store and the loop will not break 
because it will try to find timestamps of left-records that are too large (that 
do not exist).
   
   Unless you can find out that we are processing records for a left-join-call 
(and not an outer-join-call), we could break the loop inmediately at the first 
timestamp of a record which is too large.
   
   What do you think?
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: jira-unsubscr...@kafka.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to