[ 
https://issues.apache.org/jira/browse/KAFKA-8315?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16831747#comment-16831747
 ] 

Andrew edited comment on KAFKA-8315 at 5/2/19 4:40 PM:
-------------------------------------------------------

I should also mention my use case from the original Slack community 
conversation:

I am performing a large historical inner join (2 years) of two streams (using 
event time), followed by an aggregation.

For the join, I have: a join window of 2 days into the past only, with a grace 
period of 2 days (I don't want to accept updates to the aggregation beyond this 
grace period).
For the grouped aggregation, I have: a tumbling window of 1 second and a grace 
period of 4 days.

For the grouped aggregation, if I also set the store retention using 
Materialized, I can see that this affects the retention period of the 
underlying KSTREAM-AGGREGATE-STATE-STORE topics. This seems to be independent 
of the grace period.

However, using `until()` on the JoinWindows does not do the equivalent for the 
KSTREAM-JOINTHIS and KSTREAM-JOINOTHER topics, as I would have expected. These 
topics always have a retention period of 120 hours set on the topic.

What I see is that I get no aggregation records other than for the most recent 
120 hour period. So the vast majority of my 2 years fails to be 
joined/aggregated, and outputs nothing.
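
As a rough illustration of the numbers above, the following sketch does the 
arithmetic with plain `java.time.Duration` values and no Kafka classes. It 
rests on two assumptions that are not confirmed anywhere in this thread: that 
the join store retention is the join window plus the grace period, and that 
the topic retention adds the one-day default of 
`windowstore.changelog.additional.retention.ms` on top. The class and method 
names are hypothetical.

```java
import java.time.Duration;

// Back-of-the-envelope arithmetic only; nothing here calls Kafka Streams.
final class RetentionSketch {
    // Values taken from the use case described above.
    static final Duration JOIN_WINDOW = Duration.ofDays(2);   // join window, past only
    static final Duration JOIN_GRACE = Duration.ofDays(2);    // join grace period
    static final Duration OBSERVED_TOPIC_RETENTION = Duration.ofHours(120);
    static final Duration HISTORY = Duration.ofDays(2 * 365); // "2 years", leap days ignored

    // ASSUMPTION: one extra day of changelog retention, matching the default of
    // windowstore.changelog.additional.retention.ms.
    static final Duration ASSUMED_EXTRA_CHANGELOG_RETENTION = Duration.ofDays(1);

    // ASSUMPTION: topic retention = join window + grace + extra changelog retention.
    static Duration assumedJoinTopicRetention() {
        return JOIN_WINDOW.plus(JOIN_GRACE).plus(ASSUMED_EXTRA_CHANGELOG_RETENTION);
    }

    // Share of the 2-year history that falls inside the observed 120-hour retention.
    static double fractionOfHistoryCovered() {
        return (double) OBSERVED_TOPIC_RETENTION.toHours() / HISTORY.toHours();
    }

    public static void main(String[] args) {
        System.out.println("assumed join topic retention (hours): "
                + assumedJoinTopicRetention().toHours());
        System.out.println("fraction of history inside 120 hours: "
                + fractionOfHistoryCovered());
    }
}
```

Under those assumptions the sum comes to 5 days, which matches the 120 hours 
seen on the join topics, and 120 hours is only about 0.7% of a 2-year history. 
This is one possible reading of the figure, not a confirmed explanation.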


was (Author: the4thamigo_uk):
I should also mention my use case from the original Slack community 
conversation:

I am performing a large historical inner join (2 years) of two streams (using 
event time), followed by an aggregation.

For the join, I have: a join window of 2 days into the past only, with a grace 
period of 2 days (I don't want to accept updates to the aggregation beyond this 
grace period).
For the grouped aggregation, I have: a tumbling window of 1 second and a grace 
period of 4 days.

For the grouped aggregation, if I also set the store retention using 
Materialized, I can see that this affects the retention period of the 
underlying KSTREAM-AGGREGATE-STATE-STORE topics. This seems to be independent 
of the grace period.

However, using `until()` on the JoinWindows does not do the same for the 
KSTREAM-JOINTHIS and KSTREAM-JOINOTHER topics, as I would have expected. These 
topics always have a retention period of 120 hours set on the topic.

What I see is that I get no aggregation records other than for the most recent 
120 hour period. So the vast majority of my 2 years fails to be 
joined/aggregated, and outputs nothing.

> Cannot pass Materialized into a join operation
> ----------------------------------------------
>
>                 Key: KAFKA-8315
>                 URL: https://issues.apache.org/jira/browse/KAFKA-8315
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: Andrew
>            Assignee: John Roesler
>            Priority: Major
>
> The documentation says to use `Materialized` not `JoinWindows.until()` 
> (https://kafka.apache.org/22/javadoc/org/apache/kafka/streams/kstream/JoinWindows.html#until-long-),
>  but there is nowhere to pass a `Materialized` instance to the join 
> operation; it seems that only the group operation supports it.
>  
> Slack conversation here : 
> https://confluentcommunity.slack.com/archives/C48AHTCUQ/p1556799561287300



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
