[ https://issues.apache.org/jira/browse/KAFKA-5510?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16503907#comment-16503907 ]
Matthias J. Sax commented on KAFKA-5510: ---------------------------------------- Thanks [~guozhang]. With KIP-211 in place, your suggestion makes sense to me. [~wushujames]: KIP-211 is WIP for 2.0 and I hope (expect) it makes it... > Streams should commit all offsets regularly > ------------------------------------------- > > Key: KAFKA-5510 > URL: https://issues.apache.org/jira/browse/KAFKA-5510 > Project: Kafka > Issue Type: Bug > Components: streams > Reporter: Matthias J. Sax > Priority: Major > > Currently, Streams commits only offsets of partitions it did process records > for. Thus, if a partition does not have any data for longer then > {{offsets.retention.minutes}} (default 1 day) the latest committed offset > get's lost. On failure or restart {{auto.offset.rese}} kicks in potentially > resulting in reprocessing old data. > Thus, Streams should commit _all_ offset on a regular basis. Not sure what > the overhead of a commit is -- if it's too expensive to commit all offsets on > regular commit, we could also have a second config that specifies an > "commit.all.interval". > This relates to https://issues.apache.org/jira/browse/KAFKA-3806, so we > should sync to get a solid overall solution. > At the same time, it might be better to change the semantics of > {{offsets.retention.minutes}} in the first place. It might be better to apply > this setting only if the consumer group is completely dead (and not on "last > commit" and "per partition" basis). Thus, this JIRA would be a workaround fix > if core cannot be changed quickly enough. -- This message was sent by Atlassian JIRA (v7.6.3#76005)