[jira] [Commented] (KAFKA-7224) KIP-328: Add spill-to-disk for Suppression

Maatari (Jira) Thu, 30 Apr 2020 11:35:14 -0700


    [ 
https://issues.apache.org/jira/browse/KAFKA-7224?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17096863#comment-17096863
 ]


Maatari commented on KAFKA-7224:
--------------------------------

What I call an intermediate result is best explained in context. Say you have 
the following topology:
{code:java}
ktable0.join(ktable1.groupBy(...).reduce(...)){code}
Here the reduce just acts like collectList in KSQL. This is a use case we 
actually have. There is a repartition topic at the groupBy, so the same key is 
emitted multiple times while the list collected by the reduce keeps growing, 
until the entire topic is consumed. This in turn generates multiple results for 
the join as well, because the same key on the right side of the join arrives 
multiple times. So you end up with systematically ever-growing versions of the 
same record. That is what I call an intermediate result. This is a way to build 
views on normalized data that build each entity with references to all its 
outgoing links. We used to do that in our databases, but it did not scale. 
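
To make the effect concrete, here is a minimal plain-Java sketch of the behaviour described above. It does not use the Kafka Streams API; it only simulates a collect-list style reduce feeding a join, and it assumes every update is forwarded downstream (no record caching and no suppression). The class name, method name, and sample values are hypothetical and chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class IntermediateResultsDemo {

    // Simulates the right side of the join: each incoming link is appended
    // to the collected list (the "reduce"), and every update of that list is
    // joined with the single left-side value and emitted downstream.
    // Assumption: every update is forwarded, nothing is cached or suppressed.
    static List<String> joinEmissions(String leftValue, List<String> links) {
        List<String> emissions = new ArrayList<>();
        List<String> collected = new ArrayList<>();
        for (String link : links) {
            collected.add(link);                           // list grows by one
            emissions.add(leftValue + " -> " + collected); // one join result per update
        }
        return emissions;
    }

    public static void main(String[] args) {
        // Hypothetical sample data: one entity with three outgoing links.
        List<String> links = Arrays.asList("l1", "l2", "l3");
        for (String emission : joinEmissions("entityA", links)) {
            System.out.println(emission);
        }
    }
}
```

With three links for the same key, the sketch emits three join results, and only the last one contains the complete list. The first two are the intermediate results I am talking about.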

> KIP-328: Add spill-to-disk for Suppression
> ------------------------------------------
>
>                 Key: KAFKA-7224
>                 URL: https://issues.apache.org/jira/browse/KAFKA-7224
>             Project: Kafka
>          Issue Type: Improvement
>          Components: streams
>            Reporter: John Roesler
>            Priority: Major
>
> As described in 
> [https://cwiki.apache.org/confluence/display/KAFKA/KIP-328%3A+Ability+to+suppress+updates+for+KTables]
> Following on KAFKA-7223, implement the spill-to-disk buffering strategy.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
