[ 
https://issues.apache.org/jira/browse/KAFKA-6906?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16477993#comment-16477993
 ] 

Guozhang Wang commented on KAFKA-6906:
--------------------------------------

[~mjsax] EOS does not guarantee determinism when we execute multiple times, so 
I think we should not disallow wall-clock punctuation when EOS turned on.

As for the fix, I think we should get rid of {{commitOffsetNeeded}} and always 
call {{StreamTask#commitOffsets}}, the motivations is that:

1. When EOS is turned on, we need to make sure {{StreamTask#commitOffsets}} is 
called even if there is no records being processed at all (including the ones 
from wall-clock punctuate).
2. Without KIP-211, currently the offsets may be deleted even if the group is 
still valid after the specified retention period if there is no new offset 
commits during that period.

What we could change, though, is that inside the {{StreamTask#commitOffsets}} 
we check the following:

For each topic partition in {{consumedOffsets}, if there is a change since the 
last commit; and filter those topic partitions if no changes are made when 
calling {{commitSync}}. After KIP-211 this should be safe to do.


> Kafka Streams does not commit transactions if data is produced via wall-clock 
> punctuation
> -----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6906
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6906
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 1.1.0
>            Reporter: Matthias J. Sax
>            Assignee: Jagadesh Adireddi
>            Priority: Major
>
> Committing in Kafka Streams happens in regular intervals. However, committing 
> only happens if new input records got processed since the last commit (via 
> setting flag `commitOffsetNeeded` within `StreamTask#process()`)
> However, data could also be emitted via wall-clock based punctuation calls. 
> Especially if EOS is enabled, this is an issue (maybe also for non-EOS) 
> because the current running transaction is not committed and thus might time 
> out leading to a fatal error.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

Reply via email to