Hi Divij,

I think that the motivation is clear; however, the ideal solution is not, at
least not for me. I would like to ensure that we solve the real problem
instead of making it worse. In our experience, is this issue usually due to
a mistake or to a deliberate decision to scale with a growing number of
consumers? In my mind, in the current state, one should never change the
number of partitions because it results in losing group metadata. Preventing
it would not be a bad idea.
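
For context, here is a minimal sketch (not the actual Kafka source, just the
usual hash-modulo scheme as I understand it) of how a group id is resolved to
a __consumer_offsets partition today; the transaction coordinator maps
transactional ids to __transaction_state partitions in the same way:

// Simplified sketch: the group id is hashed and taken modulo the current
// partition count of __consumer_offsets, so the resolved partition changes
// when the partition count changes.
public class CoordinatorPartitionSketch {

    static int partitionFor(String groupId, int offsetsTopicPartitionCount) {
        // Mask keeps the hash non-negative before applying the modulo.
        return (groupId.hashCode() & 0x7fffffff) % offsetsTopicPartitionCount;
    }

    public static void main(String[] args) {
        String groupId = "my-group";
        // The same group typically resolves to a different partition once
        // the partition count is increased, so the coordinator looks for its
        // offsets and metadata in the wrong partition.
        System.out.println(partitionFor(groupId, 50));
        System.out.println(partitionFor(groupId, 60));
    }
}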

I agree that the ideal solution would be to change how we assign groups to
__consumer_offsets partitions. I have had the idea of making groups a
first-class resource in Kafka in the back of my mind for a while. The idea
would be to store group ids and their current partition in the controller
and to let the controller decide where a group should go when it is created.
This could be done via a plugin as well. If we have this, then adding new
__consumer_offsets partitions is no longer an issue. The controller would
simply start filling the empty partitions when new groups are created. This
would have a few other advantages. For instance, it would allow us to put
quotas on the number of groups. It also has a few challenges. For instance,
how should a group be created - implicitly as today or explicitly? There is
also the question of deletion. At the moment, groups are cleaned up
automatically after the grace period. Would we keep this?
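
To illustrate the idea (all names below are hypothetical, nothing like this
exists in Kafka today), the controller could keep an explicit mapping and
assign a newly created group to the least-loaded partition:

import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only: a controller-side registry that records an
// explicit group -> partition assignment at creation time instead of
// deriving it from a hash of the group id.
public class GroupRegistrySketch {
    private final Map<String, Integer> groupToPartition = new HashMap<>();
    private final int[] groupsPerPartition;

    GroupRegistrySketch(int offsetsTopicPartitionCount) {
        this.groupsPerPartition = new int[offsetsTopicPartitionCount];
    }

    // Explicit creation: pick the emptiest partition, so newly added
    // __consumer_offsets partitions are filled first.
    synchronized int createGroup(String groupId) {
        Integer existing = groupToPartition.get(groupId);
        if (existing != null) {
            return existing;
        }
        int target = 0;
        for (int p = 1; p < groupsPerPartition.length; p++) {
            if (groupsPerPartition[p] < groupsPerPartition[target]) {
                target = p;
            }
        }
        groupsPerPartition[target]++;
        groupToPartition.put(groupId, target);
        return target;
    }

    // Lookups no longer depend on the partition count, so increasing the
    // count does not move existing groups.
    synchronized Integer partitionFor(String groupId) {
        return groupToPartition.get(groupId);
    }
}

With something like this, increasing the number of __consumer_offsets
partitions would only affect where new groups are placed.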

I think that we should also consider the transaction coordinator in this
discussion because it suffers from the very same limitation. Ideally, we
should have a solution for both of them. Have you looked at how it handles
an increase in the number of partitions?

As a side note, we are in the middle of rewriting the group coordinator. I
think that big changes should only be made when we are done with that.

Best,
David

On Wed, Apr 5, 2023 at 10:08 PM Divij Vaidya <divijvaidy...@gmail.com>
wrote:

> Thank you for your comments and participation in the discussion, David,
> Justine and Alex.
>
> You are right! The KIP is missing a lot of details about the motivation. I
> apologize for the confusion I created with my earlier statement about
> reducing the downtime in this thread. I will ask Christo to update it.
>
> Meanwhile, as a summary, the KIP does not attempt to solve the problem of
> losing consumer offsets after a partition increase. Instead, the objective
> of the KIP is to reduce the time to recovery, i.e. the time until reads can
> start again after such an event has occurred. Prior to this KIP, the impact
> of the change manifests when one of the brokers is restarted, and the
> consumer groups remain in an erroring/undefined state until all brokers
> have finished restarting. During a rolling restart, this makes the time to
> recovery proportional to the number of brokers in the cluster. After this
> KIP is implemented, we would not wait for a broker restart to pick up the
> new partitions; instead, all brokers would be notified about the change in
> the number of partitions immediately. This would reduce the duration during
> which consumer groups are in an erroring/undefined state from the length of
> the rolling restart to the time it takes to process LeaderAndIsr (LISR)
> requests across the cluster. Hence, a (small) win!
>
> I hope this explanation throws some more light into the context.
>
> Why do users change __consumer_offsets?
> 1. They change it accidentally, OR
> 2. They increase it to scale with the increase in the number of consumers.
> This is because (correct me if I am wrong) with an increase in the number
> of consumers, we can hit the limits on single-partition throughput while
> reading/writing to __consumer_offsets. This is a genuine use case and the
> downside of losing existing metadata/offsets is acceptable to them.
>
> How do we ideally fix it?
> An ideal solution would allow us to increase the number of partitions of
> __consumer_offsets without losing existing metadata. We either need to make
> the partition assignment for a consumer group "sticky" such that existing
> groups are not re-assigned to new partitions, OR we need to migrate the
> existing data according to the new partition layout of __consumer_offsets.
> Both of these approaches are long-term fixes and require a separate
> discussion.
>
> What can we do in the short term?
> In the short term, we can either block users from changing the number of
> partitions (which might not be possible due to use case #2 above), OR we
> can at least improve (not fix, but improve!) the current situation by
> reducing the time to recovery using this KIP.
>
> Let's circle back on this discussion as soon as the KIP is updated with
> more details.
>
> --
> Divij Vaidya
>
>
>
> On Tue, Apr 4, 2023 at 8:00 PM Alexandre Dupriez
> <alexandre.dupr...@gmail.com> wrote:
>
> > Hi Christo,
> >
> > Thanks for the KIP. Apologies for the delayed review.
> >
> > At a high level, I am not sure that the KIP really solves the problem it
> > intends to.
> >
> > More specifically, the KIP mentions that once a broker is restarted
> > and the group coordinator becomes aware of the new partition count of
> > the consumer offsets topic, the problem is mitigated. However, how do
> > we access the metadata and offsets recorded in a partition once it is
> > no longer the partition a consumer group resolves to?
> >
> > Thanks,
> > Alexandre
> >
> > On Tue, Apr 4, 2023 at 6:34 PM Justine Olshan
> > <jols...@confluent.io.invalid> wrote:
> > >
> > > Hi,
> > >
> > > I'm also a bit unsure of the motivation here. Is there a need to change
> > > the number of partitions for this topic?
> > >
> > > Justine
> > >
> > > On Tue, Apr 4, 2023 at 10:07 AM David Jacot <david.ja...@gmail.com>
> > > wrote:
> > >
> > > > Hi,
> > > >
> > > > I am not very comfortable with the proposal of this KIP. The main issue
> > > > is that changing the number of partitions means that all group metadata
> > > > is lost because the hashing changes. I wonder if we should just disallow
> > > > changing the number of partitions entirely. Did we consider something
> > > > like this?
> > > >
> > > > Best,
> > > > David
> > > >
> > > > On Tue, Apr 4, 2023 at 5:57 PM Divij Vaidya
> > > > <divijvaidy...@gmail.com> wrote:
> > > >
> > > > > FYI, a user faced this problem and reached out to us in the mailing
> > > > > list [1]. Implementation of this KIP could have reduced the downtime
> > > > > for these customers.
> > > > >
> > > > > Christo, would you like to create a JIRA and associate it with the
> > > > > KIP so that we can continue to collect cases in the JIRA where users
> > > > > have faced this problem?
> > > > >
> > > > > [1] https://lists.apache.org/thread/zoowjshvdpkh5p0p7vqjd9fq8xvkr1nd
> > > > >
> > > > > --
> > > > > Divij Vaidya
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Jan 18, 2023 at 9:52 AM Christo Lolov
> > > > > <christolo...@gmail.com> wrote:
> > > > >
> > > > > > Greetings,
> > > > > >
> > > > > > I am bumping the below DISCUSS thread for KIP-895. The KIP presents
> > > > > > a situation where consumer groups are in an undefined state until a
> > > > > > rolling restart of a cluster is performed. While I have demonstrated
> > > > > > the behaviour using a ZooKeeper-based cluster, I believe the same
> > > > > > problem can be shown in a KRaft cluster. Please let me know your
> > > > > > opinions on the problem and the presented solution.
> > > > > >
> > > > > > Best,
> > > > > > Christo
> > > > > >
> > > > > > On Thursday, 29 December 2022 at 14:19:27 GMT, Christo
> > > > > > > <christo_lo...@yahoo.com.invalid> wrote:
> > > > > > >
> > > > > > >
> > > > > > > Hello!
> > > > > > > I would like to start this discussion thread on KIP-895:
> > > > > > > Dynamically refresh partition count of __consumer_offsets.
> > > > > > > The KIP proposes to alter brokers so that they refresh the
> > > > > > > partition count of __consumer_offsets used to determine group
> > > > > > > coordinators without requiring a rolling restart of the cluster.
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-895%3A+Dynamically+refresh+partition+count+of+__consumer_offsets
> > > > > > >
> > > > > > > Let me know your thoughts on the matter!
> > > > > > > Best, Christo