[ 
https://issues.apache.org/jira/browse/KAFKA-4682?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16122498#comment-16122498
 ] 

Vahid Hashemian commented on KAFKA-4682:
----------------------------------------

[~wushujames] Thanks for your feedback. Regarding the other details you brought 
up:

* [~hachikuji]'s suggestion on 
[KIP-186|https://cwiki.apache.org/confluence/display/KAFKA/KIP-186%3A+Increase+offsets+retention+default+to+7+days]
 makes sense to me. The {{OffsetCommit}} API can be used to override the 
default broker-level property {{offsets.retention.minutes}} for specific 
group/topic/partitions, so we probably wouldn't need a group-level retention 
config. What a potential KIP for this JIRA would add is that the retention 
timer starts at the moment the group becomes empty; while the group is stable 
no offset is removed, because the retention timer is not ticking yet.
* Regarding your second point, I guess we could pick either method. It all 
depends on the criteria for triggering the retention timer for a partition. 
If we trigger it when the group becomes empty (as in the previous bullet), 
then {{B-0}}'s offset would expire together with the offsets of all other 
partitions in the group. If, on the other hand, we trigger the timer when the 
partition stops being consumed within the group, then {{B-0}}'s offset could 
expire while the group is still active. I'm not sure how common this scenario 
is in real applications. If it's not that common, perhaps it wouldn't cost 
much to keep {{B-0}}'s offset around with the rest of the group. In any case, 
we should be able to pick one approach or the other depending on what you and 
others believe is more reasonable.
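
To make the two timer criteria concrete, here is a minimal sketch in Python. 
It is a toy model and not the group coordinator's code: the function names, 
the minute-based timestamps, and the scenario (the group stays active on 
{{A-0}} but stops consuming {{B-0}}) are assumptions made for illustration.

```python
from typing import Dict, Iterable, Optional, Set


def expired_when_group_empty(
    now: int,
    retention: int,
    empty_since: Optional[int],
    partitions: Iterable[str],
) -> Set[str]:
    """Option 1: one timer per group, started when the group becomes empty.

    empty_since is None while the group still has members.
    """
    if empty_since is None or now - empty_since < retention:
        return set()
    return set(partitions)


def expired_when_partition_idle(
    now: int,
    retention: int,
    stopped_at: Dict[str, Optional[int]],
) -> Set[str]:
    """Option 2: one timer per partition, started when the group stops consuming it.

    stopped_at maps a partition to the time it stopped being consumed,
    or None if the group is still consuming it.
    """
    return {
        p for p, t in stopped_at.items()
        if t is not None and now - t >= retention
    }


# Hypothetical scenario: the group is still active on A-0 but stopped
# consuming B-0 at t=0. All times are in minutes; retention is 7 days.
RETENTION = 7 * 24 * 60
now = RETENTION + 1

option1 = expired_when_group_empty(now, RETENTION, None, ["A-0", "B-0"])
option2 = expired_when_partition_idle(now, RETENTION, {"A-0": None, "B-0": 0})
```

With these inputs, option 1 expires nothing because the group is not empty, 
while option 2 expires only {{B-0}}'s offset even though the group is still 
active.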

What do you think? [~hachikuji], what are your thoughts on this?

> Committed offsets should not be deleted if a consumer is still active
> ---------------------------------------------------------------------
>
>                 Key: KAFKA-4682
>                 URL: https://issues.apache.org/jira/browse/KAFKA-4682
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: James Cheng
>
> Kafka will delete committed offsets that are older than 
> offsets.retention.minutes.
> If there is an active consumer on a low traffic partition, it is possible 
> that Kafka will delete the committed offset for that consumer. Once the 
> offset is deleted, a restart or a rebalance of that consumer will cause the 
> consumer to not find any committed offset and start consuming from 
> earliest/latest (depending on auto.offset.reset). I'm not sure, but a broker 
> failover might also cause you to start reading from auto.offset.reset (due to 
> broker restart, or coordinator failover).
> I think that Kafka should only delete offsets for inactive consumers. The 
> timer should only start after a consumer group goes inactive. For example, if 
> a consumer group goes inactive, then after 1 week, delete the offsets for 
> that consumer group. This is a solution that [~junrao] mentioned in 
> https://issues.apache.org/jira/browse/KAFKA-3806?focusedCommentId=15323521&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15323521
> The current workarounds are to:
> # Commit an offset on every partition you own on a regular basis, making sure 
> that it is more frequent than offsets.retention.minutes (a broker-side 
> setting that a consumer might not be aware of)
> or
> # Turn the value of offsets.retention.minutes up really really high. You have 
> to make sure it is higher than any valid low-traffic rate that you want to 
> support. For example, if you want to support a topic where someone produces 
> once a month, you would have to set offsets.retention.minutes to 1 month. 
> or
> # Turn on enable.auto.commit (this is essentially #1, but easier to 
> implement).
> None of these are ideal. 
> #1 can be spammy. It requires your consumers know something about how the 
> brokers are configured. Sometimes it is out of your control. Mirrormaker, for 
> example, only commits offsets on partitions where it receives data. And it is 
> duplication that you need to put into all of your consumers.
> #2 has disk-space impact on the broker (in __consumer_offsets) as well as 
> memory-size on the broker (to answer OffsetFetch).
> #3 I think has the potential for message loss (the consumer might commit on 
> messages that are not yet fully processed)
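
The behavior described in the issue, and workaround #1, can be sketched with 
a small Python model. This is only an illustration under stated assumptions: 
the function names, partition names, and timestamps are hypothetical, and the 
model tracks nothing but the time of the last commit per partition.

```python
from typing import Dict, Iterable, Set


def expired_offsets(
    commit_times: Dict[str, int], now: int, retention_minutes: int
) -> Set[str]:
    """Partitions whose last commit is older than the retention window.

    Models the behavior described in the issue: only the age of the commit
    matters, not whether a consumer is still active on the partition.
    """
    return {
        p for p, t in commit_times.items()
        if now - t >= retention_minutes
    }


def recommit_all(
    commit_times: Dict[str, int], owned: Iterable[str], now: int
) -> Dict[str, int]:
    """Workaround #1: refresh the commit time of every owned partition,
    including partitions that received no new messages."""
    updated = dict(commit_times)
    for p in owned:
        updated[p] = now
    return updated


# Hypothetical example, times in minutes, retention of one day.
RETENTION = 24 * 60

# An active consumer commits on busy-0 as data arrives; quiet-0 gets no data.
commits = {"busy-0": 2000, "quiet-0": 0}
lost = expired_offsets(commits, now=2000, retention_minutes=RETENTION)

# Same consumer, but it recommits every owned partition at t=1000.
refreshed = recommit_all({"busy-0": 0, "quiet-0": 0}, ["busy-0", "quiet-0"], now=1000)
kept = expired_offsets(refreshed, now=2000, retention_minutes=RETENTION)
```

In this model the offset for {{quiet-0}} is lost without the periodic 
recommit and nothing is lost with it, but only as long as the recommit 
interval stays below the broker's retention setting.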



