[ 
https://issues.apache.org/jira/browse/KAFKA-4609?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15924093#comment-15924093
 ] 

Michael Noll commented on KAFKA-4609:
-------------------------------------

Care to elaborate [~damianguy]? Why does this happen only when record caching 
is enabled?

{quote}
When caching is enabled, KTable/KTable joins can result in duplicate values 
being emitted. This will occur if there were updates to the same key in both 
tables. Each table is flushed independently, and each table will trigger the 
join, so you get two results for the same key. 
{quote}


> KTable/KTable join followed by groupBy and aggregate/count can result in 
> incorrect results
> ------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-4609
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4609
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.1.1, 0.10.2.0
>            Reporter: Damian Guy
>            Assignee: Damian Guy
>              Labels: architecture
>
> When caching is enabled, KTable/KTable joins can result in duplicate values 
> being emitted. This will occur if there were updates to the same key in both 
> tables. Each table is flushed independently, and each table will trigger the 
> join, so you get two results for the same key. 
> If we subsequently perform a groupBy followed by an aggregate operation, these 
> duplicates are processed as well, resulting in incorrect aggregated values. For 
> example, a count will be double the value it should be.
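
The doubling described above can be illustrated with a toy model in plain Java. This is only a sketch and does not use the Kafka Streams API or reproduce its cache and flush logic. The class `DuplicateJoinDemo` and the methods `flushBothTables` and `countByGroup` are hypothetical names invented for this illustration. It assumes that each table flush emits one join result for the updated key and that the downstream count treats every emitted result as a new record.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy model only: NOT the Kafka Streams implementation.
public class DuplicateJoinDemo {

    // Assumption: both tables received an update for the same key, and each
    // table is flushed independently. Each flush triggers the join once,
    // so the same joined value is emitted twice.
    public static List<String> flushBothTables(String key, String left, String right) {
        List<String> emitted = new ArrayList<>();
        emitted.add(key + "=" + left + "," + right); // emitted when the left table is flushed
        emitted.add(key + "=" + left + "," + right); // emitted when the right table is flushed
        return emitted;
    }

    // Naive downstream groupBy + count: groups by the joined value and
    // counts every emitted result, duplicates included.
    public static Map<String, Long> countByGroup(List<String> joinResults) {
        Map<String, Long> counts = new HashMap<>();
        for (String result : joinResults) {
            String group = result.substring(result.indexOf('=') + 1);
            counts.merge(group, 1L, Long::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> emitted = flushBothTables("k1", "a", "b");
        System.out.println(emitted);
        System.out.println(countByGroup(emitted));
    }
}
```

In this model a single logical update to key `k1` produces two identical join results, so the group `a,b` ends up with a count of 2 where 1 would be expected.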



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
