I thought about this scenario as well.

However, my conclusion was that because __consumer_offsets is a compacted
topic, this extra clutter from short-lived consumer groups is negligible.

The disk footprint is roughly the product of the number of consumer groups
and the number of partitions in each group's subscription. For short-lived
consumer groups, I'd typically expect that product to stay under ~100K
offset entries.

The one area I wasn't sure about is how the group coordinator's in-memory
cache of offsets works. Is it a pull-through cache of unbounded size, or
does it hold all offsets for all groups that use that broker as their
coordinator? If the latter, there may be an OOM risk, and it might be worth
investigating a bounded cache design.
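To illustrate what a bounded pull-through cache might look like (purely a
sketch of the idea, not how the group coordinator is actually implemented;
`load_from_log` stands in for a hypothetical read of __consumer_offsets):

```python
from collections import OrderedDict

class BoundedOffsetCache:
    """LRU pull-through cache sketch: evicts the least recently used
    group's offsets once max_groups is exceeded. Hypothetical design,
    not the actual group coordinator implementation."""

    def __init__(self, max_groups, load_from_log):
        self.max_groups = max_groups
        self.load_from_log = load_from_log  # fallback to the offsets topic
        self.cache = OrderedDict()

    def get(self, group_id):
        if group_id in self.cache:
            self.cache.move_to_end(group_id)  # mark as recently used
        else:
            self.cache[group_id] = self.load_from_log(group_id)
            if len(self.cache) > self.max_groups:
                self.cache.popitem(last=False)  # evict the LRU group
        return self.cache[group_id]
```

A design like this caps memory at max_groups entries regardless of how many
groups churn through the broker, at the cost of re-reading the log on a miss.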

Also, switching to this design means that consumer groups no longer need to
commit all offsets; they only need to commit the ones that have changed. I
expect broker-side performance gains in certain cases from parsing smaller
OffsetCommit requests. For example, due to some bad design decisions we
have a couple of topics with 1500 partitions, of which only ~10% are
regularly used, so 90% of the OffsetCommit request processing is
unnecessary.
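A minimal client-side sketch of that idea (a hypothetical helper for
illustration, not a real Kafka client API): track which partitions advanced
since the last commit and include only those in the OffsetCommit payload.

```python
class IncrementalCommitter:
    """Builds a commit payload containing only partitions whose offset
    changed since the last commit. Hypothetical illustration, not an
    actual Kafka client API."""

    def __init__(self):
        self.current = {}    # partition -> latest consumed offset
        self.committed = {}  # partition -> last committed offset

    def record(self, partition, offset):
        self.current[partition] = offset

    def build_commit(self):
        # Only "dirty" partitions are included in the OffsetCommit request.
        dirty = {p, 
                 o in self.current.items()} if False else \
                {p: o for p, o in self.current.items()
                 if self.committed.get(p) != o}
        self.committed.update(dirty)
        return dirty

c = IncrementalCommitter()
c.record(0, 10)
c.record(1, 5)
print(c.build_commit())  # {0: 10, 1: 5}
c.record(0, 12)          # only partition 0 advanced
print(c.build_commit())  # {0: 12}
```

With 1500 partitions and ~10% active, a payload built this way would carry
~150 entries instead of 1500.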



On Wed, Nov 15, 2017 at 11:27 AM, Vahid S Hashemian <
vahidhashem...@us.ibm.com> wrote:

> I'm forwarding this feedback from John to the mailing list, and responding
> at the same time:
>
> John, thanks for the feedback. I agree that the scenario you described
> could lead to unnecessarily long offset retention for other consumer
> groups.
> If we want to address that in this KIP we could either keep the
> 'retention_time' field in the protocol, or propose a per group retention
> configuration.
>
> I'd like to ask the community for feedback on whether we should design
> and implement a per-group retention configuration as part of this KIP, or
> keep it simple at this stage and go with a single broker-level setting.
> Thanks in advance for sharing your opinion.
>
> --Vahid
>
>
>
>
> From:   John Crowley <jdcrow...@gmail.com>
> To:     vahidhashem...@us.ibm.com
> Date:   11/15/2017 10:16 AM
> Subject:        [DISCUSS] KIP-211: Revise Expiration Semantics of Consumer
> Group Offsets
>
>
>
> Sorry for the clutter, first found KAFKA-3806, then -4682, and finally
> this KIP - they have more detail which I’ll avoid duplicating here.
>
> I think that not starting the expiration until all consumers have ceased,
> and clearing all offsets at the same time, does clean things up and solves
> 99% of the original issues - and 100% of my particular concern.
>
> A valid use case may still exist for a periodic application - say,
> production applications posting to topics all week, and then a weekend
> batch job which consumes all new messages.
>
> Setting offsets.retention.minutes to 10 days (14,400 minutes) does cover
> this, but at the cost of extra clutter if there are other consumer groups
> which are truly created/used/abandoned on a frequent basis. Being able to
> set offsets.retention.minutes on a per-groupId basis allows this case to
> be covered cleanly as well, and makes it visible that these groupIds are
> a special case.
>
> But relatively minor, and should not delay the original KIP.
>
> Thanks,
>
> John Crowley
>


-- 

*Jeff Widman*
jeffwidman.com <http://www.jeffwidman.com/> | 740-WIDMAN-J (943-6265)
<><
