[
https://issues.apache.org/jira/browse/KAFKA-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Vahid Hashemian updated KAFKA-4682:
-----------------------------------
Labels: kip (was: )
> Committed offsets should not be deleted if a consumer is still active
> ---------------------------------------------------------------------
>
> Key: KAFKA-4682
> URL: https://issues.apache.org/jira/browse/KAFKA-4682
> Project: Kafka
> Issue Type: Bug
> Reporter: James Cheng
> Labels: kip
>
> Kafka will delete committed offsets that are older than
> offsets.retention.minutes.
> If there is an active consumer on a low traffic partition, it is possible
> that Kafka will delete the committed offset for that consumer. Once the
> offset is deleted, a restart or a rebalance of that consumer will cause the
> consumer to not find any committed offset and start consuming from
> earliest/latest (depending on auto.offset.reset). I'm not sure, but a broker
> failover might also cause you to start reading from auto.offset.reset (due to
> broker restart, or coordinator failover).
> I think that Kafka should only delete offsets for inactive consumers. The
> timer should only start after a consumer group goes inactive. For example, if
> a consumer group goes inactive, then after 1 week, delete the offsets for
> that consumer group. This is a solution that [~junrao] mentioned in
> https://issues.apache.org/jira/browse/KAFKA-3806?focusedCommentId=15323521&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15323521
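
To make the difference between the two rules concrete, here is a minimal Python sketch. It is only a model of the behaviour described above, not Kafka's coordinator code: the function names, the `last_commit` and `inactive_since` parameters, and the integer minute clock are all assumptions made for illustration.

```python
from typing import Optional


def expired_by_commit_age(now: int, last_commit: int, retention: int) -> bool:
    """Behaviour described in this issue: expire on commit age alone."""
    return now - last_commit > retention


def expired_when_inactive(now: int, inactive_since: Optional[int], retention: int) -> bool:
    """Proposed behaviour: the timer starts only once the group goes inactive.

    inactive_since is None while the group still has active members.
    """
    if inactive_since is None:
        return False
    return now - inactive_since > retention


# Hypothetical value standing in for offsets.retention.minutes (1 day).
RETENTION = 24 * 60
```

With an active group whose last commit on a quiet partition was three days ago, `expired_by_commit_age(3 * 24 * 60, 0, RETENTION)` is true, while `expired_when_inactive(3 * 24 * 60, None, RETENTION)` is false because the group never went inactive.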
> The current workarounds are to:
> # Commit an offset on every partition you own on a regular basis, making sure
> that it is more frequent than offsets.retention.minutes (a broker-side
> setting that a consumer might not be aware of)
> or
> # Turn the value of offsets.retention.minutes up really really high. You have
> to make sure it is higher than any valid low-traffic rate that you want to
> support. For example, if you want to support a topic where someone produces
> once a month, you would have to set offsets.retention.minutes to 1 month.
> or
> # Turn on enable.auto.commit (this is essentially #1, but easier to
> implement).
> None of these are ideal.
> #1 can be spammy. It requires that your consumers know something about how
> the brokers are configured. Sometimes it is out of your control: MirrorMaker,
> for example, only commits offsets on partitions where it receives data. And
> it is duplicated logic that you need to put into all of your consumers.
> #2 has disk-space impact on the broker (in __consumer_offsets) as well as
> memory-size on the broker (to answer OffsetFetch).
> #3 I think has the potential for message loss (the consumer might commit on
> messages that are not yet fully processed).
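
For reference, workaround #1 could look roughly like the sketch below: on a timer shorter than offsets.retention.minutes, commit the current position of every assigned partition, whether or not it received data. `FakeConsumer` is a hypothetical in-memory stand-in written only for this example, not a class from any client library; its `assignment()`, `position()` and `commit()` methods are loosely modelled on what a real consumer client offers, and a real client's signatures may differ.

```python
from typing import Dict, Set, Tuple

Partition = Tuple[str, int]  # (topic, partition number)


class FakeConsumer:
    """Hypothetical in-memory stand-in for a consumer client."""

    def __init__(self, positions: Dict[Partition, int]) -> None:
        self._positions = dict(positions)
        self.committed: Dict[Partition, int] = {}
        self.commit_calls = 0

    def assignment(self) -> Set[Partition]:
        return set(self._positions)

    def position(self, partition: Partition) -> int:
        return self._positions[partition]

    def commit(self, offsets: Dict[Partition, int]) -> None:
        self.committed.update(offsets)
        self.commit_calls += 1


def recommit_all(consumer) -> Dict[Partition, int]:
    """Commit the current position of every assigned partition, idle ones included."""
    offsets = {tp: consumer.position(tp) for tp in consumer.assignment()}
    if offsets:  # nothing to commit when no partitions are assigned
        consumer.commit(offsets)
    return offsets
```

Calling `recommit_all` periodically refreshes the commit for quiet partitions too, which is exactly the extra traffic and per-consumer duplication that makes this workaround unattractive.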