Hey All!

Thanks for the KIP, it's wonderful to see cooperative rebalancing making it
down the stack!

I had a few questions:

1. The 'Rejected Alternatives' section describes how member epoch should
advance in step with the group epoch and assignment epoch values. I think
that this is a good idea for the reasons described in the KIP. When the
protocol is incrementally assigning partitions to a worker, what member
epoch does each incremental assignment use? Are member epochs re-used, and
a single member epoch can correspond to multiple different (monotonically
larger) assignments?

2. Is the Assignor's 'Reason' field opaque to the group coordinator? If
not, should custom client-side assignor implementations interact with the
Reason field, and how is its common meaning agreed upon? If so, what is the
benefit of a distinct Reason field over including such functionality in the
opaque metadata?

3. The following is included in the KIP: "Thanks to this, the input of the
client side assignor is entirely driven by the group coordinator. The
consumer is no longer responsible for maintaining any state besides its
assigned partitions." Does this mean that the client-side assignor MAY
incorporate additional non-Metadata state (such as partition throughput,
cpu/memory metrics, config topics, etc), or that additional non-Metadata
state SHOULD NOT be used?

4. I see that there are separate classes
for org.apache.kafka.server.group.consumer.PartitionAssignor
and org.apache.kafka.clients.consumer.PartitionAssignor that seem to
overlap significantly. Is it possible for these two implementations to be
unified? This would serve to promote feature parity of server-side and
client-side assignors, and would also facilitate operational flexibility in
certain situations. For example, if a server-side assignor has some poor
behavior and needs a patch, deploying the patched assignor to the client
and switching one consumer group to a client-side assignor may be faster
and less risky than patching all of the brokers. With the currently
proposed distinct APIs, a non-trivial reimplementation would have to be
assembled, and if the two APIs have diverged significantly, then it is
possible that a reimplementation would not be possible.

--
Greg Harris
gharris1...@gmail.com
github.com/gharris1727

On Wed, Aug 3, 2022 at 8:39 AM Sagar <sagarmeansoc...@gmail.com> wrote:

> Hi Guozhang/David,
>
> I created a confluence page to discuss how Connect would need to change
> based on the new rebalance protocol. Here's the page:
>
> https://cwiki.apache.org/confluence/display/KAFKA/%5BDRAFT%5DIntegrating+Kafka+Connect+With+New+Consumer+Rebalance+Protocol
>
> It's also pretty longish and I have tried to keep a format similar to
> KIP-848. Let me know what you think. Also, do you think this should be
> moved to a separate discussion thread or is this one fine?
>
> Thanks!
> Sagar.
>
> On Tue, Jul 26, 2022 at 7:37 AM Sagar <sagarmeansoc...@gmail.com> wrote:
>
> > Hello Guozhang,
> >
> > Thank you so much for the doc on Kafka Streams. Sure, I would do the
> > analysis and come up with such a document.
> >
> > Thanks!
> > Sagar.
> >
> > On Tue, Jul 26, 2022 at 4:47 AM Guozhang Wang <wangg...@gmail.com>
> wrote:
> >
> >> Hello Sagar,
> >>
> >> It would be great if you could come back with some analysis on how to
> >> implement the Connect side integration with the new protocol; so far
> >> besides leveraging on the new "protocol type" we did not yet think
> through
> >> the Connect side implementations. For Streams here's a draft of
> >> integration
> >> plan:
> >>
> >>
> https://docs.google.com/document/d/17PNz2sGoIvGyIzz8vLyJTJTU2rqnD_D9uHJnH9XARjU/edit#heading=h.pdgirmi57dvn
> >> just FYI for your analysis on Connect.
> >>
> >> On Tue, Jul 19, 2022 at 10:48 PM Sagar <sagarmeansoc...@gmail.com>
> wrote:
> >>
> >> > Hi David,
> >> >
> >> > Thank you for your response. The reason I thought connect can also fit
> >> into
> >> > this new scheme is that even today the connect uses a
> WorkerCoordinator
> >> > extending from AbstractCoordinator to empower rebalances of
> >> > tasks/connectors. The WorkerCoordinator sets the protocolType() to
> >> connect
> >> > and uses the metadata() method by plumbing into
> >> JoinGroupRequestProtocol.
> >> >
> >> > I think the changes to support connect would be similar at a high
> level
> >> to
> >> > the changes in streams mainly because of the Client side assignors
> being
> >> > used in both. At an implementation level, we might need to make a lot
> of
> >> > changes to get onto this new assignment protocol like enhancing the
> >> > JoinGroup request/response and SyncGroup and using
> >> ConsumerGroupHeartbeat
> >> > API etc again on similar lines to streams (or there might be
> >> deviations). I
> >> > would try to perform a detailed analysis of the same  and we can have
> a
> >> > separate discussion thread for that as that would derail this
> discussion
> >> > thread. Let me know if that sounds good to you.
> >> >
> >> > Thanks!
> >> > Sagar.
> >> >
> >> >
> >> >
> >> > On Fri, Jul 15, 2022 at 5:47 PM David Jacot
> <dja...@confluent.io.invalid
> >> >
> >> > wrote:
> >> >
> >> > > Hi Sagar,
> >> > >
> >> > > Thanks for your comments.
> >> > >
> >> > > 1) Yes. That refers to `Assignment#error`. Sure, I can mention it.
> >> > >
> >> > > 2) The idea is to transition C from his current assignment to his
> >> > > target assignment when he can move to epoch 3. When that happens,
> the
> >> > > member assignment is updated and persisted with all its assigned
> >> > > partitions even if they are not all revoked yet. In other words, the
> >> > > member assignment becomes the target assignment. This is basically
> an
> >> > > optimization to avoid having to write all the changes to the log.
> The
> >> > > examples are based on the persisted state so I understand the
> >> > > confusion. Let me see if I can improve this in the description.
> >> > >
> >> > > 3) Regarding Connect, it could reuse the protocol with a client side
> >> > > assignor if it fits in the protocol. The assignment is about
> >> > > topicid-partitions + metadata, could Connect fit into this?
> >> > >
> >> > > Best,
> >> > > David
> >> > >
> >> > > On Fri, Jul 15, 2022 at 1:55 PM Sagar <sagarmeansoc...@gmail.com>
> >> wrote:
> >> > > >
> >> > > > Hi David,
> >> > > >
> >> > > > Thanks for the KIP. I just had minor observations:
> >> > > >
> >> > > > 1) In the Assignment Error section in Client Side mode Assignment
> >> > > process,
> >> > > > you mentioned => `In this case, the client side assignor can
> return
> >> an
> >> > > > error to the group coordinator`. In this case are you referring to
> >> the
> >> > > > Assignor returning an AssignmentError that's listed down towards
> the
> >> > end?
> >> > > > If yes, do you think it would make sense to mention this
> explicitly
> >> > here?
> >> > > >
> >> > > > 2) In the Case Studies section, I have a slight confusion, not
> sure
> >> if
> >> > > > others have the same. Consider this step:
> >> > > >
> >> > > > When B heartbeats, the group coordinator transitions him to epoch
> 3
> >> > > because
> >> > > > B has no partitions to revoke. It persists the change and reply.
> >> > > >
> >> > > >    - Group (epoch=3)
> >> > > >       - A
> >> > > >       - B
> >> > > >       - C
> >> > > >    - Target Assignment (epoch=3)
> >> > > >       - A - partitions=[foo-0]
> >> > > >       - B - partitions=[foo-2]
> >> > > >       - C - partitions=[foo-1]
> >> > > >    - Member Assignment
> >> > > >       - A - epoch=2, partitions=[foo-0, foo-1]
> >> > > >       - B - epoch=3, partitions=[foo-2]
> >> > > >       - C - epoch=3, partitions=[foo-1]
> >> > > >
> >> > > > When C heartbeats, it transitions to epoch 3 but cannot get foo-1
> >> yet.
> >> > > >
> >> > > > Here,it's mentioned that member C can't get the foo-1 partition
> yet,
> >> > but
> >> > > > based on the description above, it seems it already has it. Do you
> >> > think
> >> > > it
> >> > > > would be better to remove it and populate it only when it actually
> >> gets
> >> > > it?
> >> > > > I see this in a lot of other places, so have I understood it
> >> > incorrectly
> >> > > ?
> >> > > >
> >> > > >
> >> > > > Regarding connect , it might be out of scope of this discussion,
> but
> >> > from
> >> > > > what I understood it would probably be running in client side
> >> assignor
> >> > > mode
> >> > > > even on the new rebalance protocol as it has its own Custom
> >> > > Assignors(Eager
> >> > > > and IncrementalCooperative).
> >> > > >
> >> > > > Thanks!
> >> > > >
> >> > > > Sagar.
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > >
> >> > > > On Fri, Jul 15, 2022 at 5:00 PM David Jacot
> >> > <dja...@confluent.io.invalid
> >> > > >
> >> > > > wrote:
> >> > > >
> >> > > > > Thanks Hector! Our goal is to move forward with specialized API
> >> > > > > instead of relying on one generic API. For Connect, we can apply
> >> the
> >> > > > > exact same pattern and reuse/share the core implementation on
> the
> >> > > > > server side. For the schema registry, I think that we should
> >> consider
> >> > > > > having a tailored API to do simple membership/leader election.
> >> > > > >
> >> > > > > Best,
> >> > > > > David
> >> > > > >
> >> > > > > On Fri, Jul 15, 2022 at 10:22 AM Ismael Juma <ism...@juma.me.uk
> >
> >> > > wrote:
> >> > > > > >
> >> > > > > > Three quick comments:
> >> > > > > >
> >> > > > > > 1. Regarding java.util.regex.Pattern vs
> >> com.google.re2j.Pattern, we
> >> > > > > should
> >> > > > > > document the differences in more detail before deciding one
> way
> >> or
> >> > > > > another.
> >> > > > > > That said, if people pass java.util.regex.Pattern, they expect
> >> > their
> >> > > > > > semantics to be honored. If we are doing something different,
> >> then
> >> > we
> >> > > > > > should consider adding an overload with our own Pattern class
> (I
> >> > > don't
> >> > > > > > think we'd want to expose re2j's at this point).
> >> > > > > > 2. Regarding topic ids, any major new protocol should
> integrate
> >> > fully
> >> > > > > with
> >> > > > > > it and should handle the topic recreation case correctly.
> That's
> >> > the
> >> > > main
> >> > > > > > part we need to handle. I agree with David that we'd want to
> add
> >> > > topic
> >> > > > > ids
> >> > > > > > to the relevant protocols that don't have it yet and that we
> can
> >> > > probably
> >> > > > > > focus on the internals versus adding new APIs to the Java
> >> Consumer
> >> > > > > (unless
> >> > > > > > we find that adding new APIs is required for reasonable
> >> semantics).
> >> > > > > > 3. I am still not sure about the coordinator storing the
> >> configs.
> >> > > It's
> >> > > > > > powerful for configs to be centralized in the metadata log for
> >> > > various
> >> > > > > > reasons (auditability, visibility, consistency, etc.).
> >> Similarly, I
> >> > > am
> >> > > > > not
> >> > > > > > sure about automatically deleting configs in a way that they
> >> cannot
> >> > > be
> >> > > > > > recovered. A good property for modern systems is to minimize
> the
> >> > > number
> >> > > > > of
> >> > > > > > unrecoverable data loss scenarios.
> >> > > > > >
> >> > > > > > Ismael
> >> > > > > >
> >> > > > > > On Wed, Jul 13, 2022 at 3:47 PM David Jacot
> >> > > <dja...@confluent.io.invalid
> >> > > > > >
> >> > > > > > wrote:
> >> > > > > >
> >> > > > > > > Thanks Guozhang. My answers are below:
> >> > > > > > >
> >> > > > > > > > 1) the migration path, especially the last step when
> clients
> >> > > flip the
> >> > > > > > > flag
> >> > > > > > > > to enable the new protocol, in which we would have a
> window
> >> > where
> >> > > > > both
> >> > > > > > > new
> >> > > > > > > > protocols / rpcs and old protocols / rpcs are used by
> >> members
> >> > of
> >> > > the
> >> > > > > same
> >> > > > > > > > group. How the coordinator could "mimic" the old behavior
> >> while
> >> > > > > using the
> >> > > > > > > > new protocol is something we need to present about.
> >> > > > > > >
> >> > > > > > > Noted. I just published a new version of KIP which includes
> >> more
> >> > > > > > > details about this. See the "Supporting Online Consumer
> Group
> >> > > Upgrade"
> >> > > > > > > and the "Compatibility, Deprecation, and Migration Plan". I
> >> think
> >> > > that
> >> > > > > > > I have to think through a few cases now but the overall idea
> >> and
> >> > > > > > > mechanism should be understandable.
> >> > > > > > >
> >> > > > > > > > 2) the usage of topic ids. So far as KIP-516 the topic ids
> >> are
> >> > > only
> >> > > > > used
> >> > > > > > > as
> >> > > > > > > > part of RPCs and admin client, but they are not exposed
> via
> >> any
> >> > > > > public
> >> > > > > > > APIs
> >> > > > > > > > to consumers yet. I think the question is, first should we
> >> let
> >> > > the
> >> > > > > > > consumer
> >> > > > > > > > client to be maintaining the names -> ids mapping itself
> to
> >> > fully
> >> > > > > > > leverage
> >> > > > > > > > on all the augmented existing RPCs and the new RPCs with
> the
> >> > > topic
> >> > > > > ids;
> >> > > > > > > and
> >> > > > > > > > secondly, should we ever consider exposing the topic ids
> in
> >> the
> >> > > > > consumer
> >> > > > > > > > public APIs as well (both subscribe/assign, as well as in
> >> the
> >> > > > > rebalance
> >> > > > > > > > listener for cases like topic deletion-and-recreation).
> >> > > > > > >
> >> > > > > > > a) Assuming that we would include converting all the offsets
> >> > > related
> >> > > > > > > RPCs to using topic ids in this KIP, the consumer would be
> >> able
> >> > to
> >> > > > > > > fully operate with topic ids. That being said, it still has
> to
> >> > > provide
> >> > > > > > > the topics names in various APIs so having a mapping in the
> >> > > consumer
> >> > > > > > > seems inevitable to me.
> >> > > > > > > b) I don't have a strong opinion on this. Here I wonder if
> >> this
> >> > > goes
> >> > > > > > > beyond the scope of this KIP. I would rather focus on the
> >> > internals
> >> > > > > > > here and we can consider this separately if we see value in
> >> doing
> >> > > it.
> >> > > > > > >
> >> > > > > > > Coming back to Ismael's point about using topic ids in the
> >> > > > > > > ConsumerGroupHeartbeatRequest, I think that there is one
> >> > advantage
> >> > > in
> >> > > > > > > favour of it. The consumer will have the opportunity to
> >> validate
> >> > > that
> >> > > > > > > the topics exists before passing them into the group
> rebalance
> >> > > > > > > protocol. Obviously, the coordinator will also notice it but
> >> it
> >> > > does
> >> > > > > > > not really have a way to reject an invalid topic in the
> >> response.
> >> > > > > > >
> >> > > > > > > > I'm agreeing with David on all other minor questions
> except
> >> for
> >> > > the
> >> > > > > > > > `subscribe(Pattern)` question: personally I think it's not
> >> > > necessary
> >> > > > > to
> >> > > > > > > > deprecate the subscribe API with Pattern, but instead we
> >> still
> >> > > use
> >> > > > > > > Pattern
> >> > > > > > > > while just documenting that our subscription may be
> >> rejected by
> >> > > the
> >> > > > > > > server.
> >> > > > > > > > Since the incompatible case is a very rare scenario I felt
> >> > using
> >> > > an
> >> > > > > > > > overloaded `String` based subscription may be more
> >> vulnerable
> >> > to
> >> > > > > various
> >> > > > > > > > invalid regexes.
> >> > > > > > >
> >> > > > > > > That could work. I have to look at the differences between
> the
> >> > two
> >> > > > > > > engines to better understand the potential issues. My
> >> > > understanding is
> >> > > > > > > that would work for all the basic regular expressions. The
> >> > > differences
> >> > > > > > > between the two are mainly about the various character
> >> classes. I
> >> > > > > > > wonder what other people think about this.
> >> > > > > > >
> >> > > > > > > Best,
> >> > > > > > > David
> >> > > > > > >
> >> > > > > > > On Tue, Jul 12, 2022 at 11:28 PM Guozhang Wang <
> >> > wangg...@gmail.com
> >> > > >
> >> > > > > wrote:
> >> > > > > > > >
> >> > > > > > > > Thanks David! I think on the high level there are two meta
> >> > > points we
> >> > > > > need
> >> > > > > > > > to concretize a bit more:
> >> > > > > > > >
> >> > > > > > > > 1) the migration path, especially the last step when
> clients
> >> > > flip the
> >> > > > > > > flag
> >> > > > > > > > to enable the new protocol, in which we would have a
> window
> >> > where
> >> > > > > both
> >> > > > > > > new
> >> > > > > > > > protocols / rpcs and old protocols / rpcs are used by
> >> members
> >> > of
> >> > > the
> >> > > > > same
> >> > > > > > > > group. How the coordinator could "mimic" the old behavior
> >> while
> >> > > > > using the
> >> > > > > > > > new protocol is something we need to present about.
> >> > > > > > > > 2) the usage of topic ids. So far as KIP-516 the topic ids
> >> are
> >> > > only
> >> > > > > used
> >> > > > > > > as
> >> > > > > > > > part of RPCs and admin client, but they are not exposed
> via
> >> any
> >> > > > > public
> >> > > > > > > APIs
> >> > > > > > > > to consumers yet. I think the question is, first should we
> >> let
> >> > > the
> >> > > > > > > consumer
> >> > > > > > > > client to be maintaining the names -> ids mapping itself
> to
> >> > fully
> >> > > > > > > leverage
> >> > > > > > > > on all the augmented existing RPCs and the new RPCs with
> the
> >> > > topic
> >> > > > > ids;
> >> > > > > > > and
> >> > > > > > > > secondly, should we ever consider exposing the topic ids
> in
> >> the
> >> > > > > consumer
> >> > > > > > > > public APIs as well (both subscribe/assign, as well as in
> >> the
> >> > > > > rebalance
> >> > > > > > > > listener for cases like topic deletion-and-recreation).
> >> > > > > > > >
> >> > > > > > > > I'm agreeing with David on all other minor questions
> except
> >> for
> >> > > the
> >> > > > > > > > `subscribe(Pattern)` question: personally I think it's not
> >> > > necessary
> >> > > > > to
> >> > > > > > > > deprecate the subscribe API with Pattern, but instead we
> >> still
> >> > > use
> >> > > > > > > Pattern
> >> > > > > > > > while just documenting that our subscription may be
> >> rejected by
> >> > > the
> >> > > > > > > server.
> >> > > > > > > > Since the incompatible case is a very rare scenario I felt
> >> > using
> >> > > an
> >> > > > > > > > overloaded `String` based subscription may be more
> >> vulnerable
> >> > to
> >> > > > > various
> >> > > > > > > > invalid regexes.
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > Guozhang
> >> > > > > > > >
> >> > > > > > > > On Tue, Jul 12, 2022 at 5:23 AM David Jacot
> >> > > > > <dja...@confluent.io.invalid
> >> > > > > > > >
> >> > > > > > > > wrote:
> >> > > > > > > >
> >> > > > > > > > > Hi Ismael,
> >> > > > > > > > >
> >> > > > > > > > > Thanks for your feedback. Let me answer your questions
> >> > inline.
> >> > > > > > > > >
> >> > > > > > > > > > 1. I think it's premature to talk about target
> versions
> >> for
> >> > > > > > > deprecation
> >> > > > > > > > > and
> >> > > > > > > > > > removal of the existing group protocol. Unlike KRaft,
> >> this
> >> > > > > affects a
> >> > > > > > > core
> >> > > > > > > > > > client protocol and hence deprecation/removal will be
> >> > heavily
> >> > > > > > > dependent
> >> > > > > > > > > on
> >> > > > > > > > > > how quickly applications migrate to the new protocol.
> >> > > > > > > > >
> >> > > > > > > > > That makes sense. I will remove it.
> >> > > > > > > > >
> >> > > > > > > > > > 2. The KIP says we intend to release this in 4.x, but
> it
> >> > > wasn't
> >> > > > > made
> >> > > > > > > > > clear
> >> > > > > > > > > > why. If we added that as a way to estimate when we'd
> >> > > deprecate
> >> > > > > and
> >> > > > > > > remove
> >> > > > > > > > > > the group protocol, I also suggest removing this part.
> >> > > > > > > > >
> >> > > > > > > > > Let me explain my reasoning. As explained, I plan to
> >> rewrite
> >> > > the
> >> > > > > group
> >> > > > > > > > > coordinator in Java while we implement the new protocol.
> >> This
> >> > > means
> >> > > > > > > > > that the internals will be slightly different (e.g.
> >> threading
> >> > > > > model).
> >> > > > > > > > > Therefore, I wanted to tighten the switch from the old
> >> group
> >> > > > > > > > > coordinator to the new group coordinator to a major
> >> release.
> >> > > The
> >> > > > > > > > > alternative would be to use a flag to do the switch
> >> instead
> >> > of
> >> > > > > relying
> >> > > > > > > > > on the software upgrade.
> >> > > > > > > > >
> >> > > > > > > > > > 3. We need to flesh out the details of the migration
> >> story.
> >> > > It
> >> > > > > sounds
> >> > > > > > > > > like
> >> > > > > > > > > > we're saying we will support online migrations. Is
> that
> >> > > correct?
> >> > > > > We
> >> > > > > > > > > should
> >> > > > > > > > > > explain this in detail. It could also be done as a
> >> separate
> >> > > KIP,
> >> > > > > if
> >> > > > > > > it's
> >> > > > > > > > > > easier.
> >> > > > > > > > >
> >> > > > > > > > > Yes, we will support online migrations for the group.
> That
> >> > > means
> >> > > > > that
> >> > > > > > > > > a group using the old protocol will be able to switch to
> >> the
> >> > > new
> >> > > > > > > > > protocol.
> >> > > > > > > > >
> >> > > > > > > > > Let me briefly explain how that will work though. It is
> >> > > basically a
> >> > > > > > > > > four step process:
> >> > > > > > > > >
> >> > > > > > > > > 1. The cluster must be upgraded or rolled to a software
> >> > > supporting
> >> > > > > the
> >> > > > > > > > > new group coordinator. Both the old and the new
> >> coordinator
> >> > > will
> >> > > > > > > > > support the old protocol and rely on the same persisted
> >> > > metadata so
> >> > > > > > > > > they can work together. This point is an offline
> >> migration.
> >> > We
> >> > > > > cannot
> >> > > > > > > > > do this one live because it would require shutting down
> >> the
> >> > > current
> >> > > > > > > > > coordinator and starting up the new one and that would
> >> cause
> >> > > > > > > > > unavailabilities.
> >> > > > > > > > > 2. The cluster's metadata version/IBP must be upgraded
> to
> >> X
> >> > in
> >> > > > > order
> >> > > > > > > > > to enable the new protocol. This cannot be done before
> 1)
> >> is
> >> > > > > > > > > terminated because the old coordinator doesn't support
> the
> >> > new
> >> > > > > > > > > protocol.
> >> > > > > > > > > 3. The consumers must be upgraded to a version
> supporting
> >> the
> >> > > > > online
> >> > > > > > > > > migration (must have KIP-792). If the consumer is
> already
> >> > > there.
> >> > > > > > > > > Nothing must be done at this point.
> >> > > > > > > > > 4. The consumers must be rolled with the feature flag
> >> turned
> >> > > on.
> >> > > > > The
> >> > > > > > > > > consumer group is automatically converted when the first
> >> > > consumer
> >> > > > > > > > > using the new protocol joins the group. While the
> members
> >> > > using the
> >> > > > > > > > > old protocol are being upgraded, the old protocol is
> >> proxied
> >> > > into
> >> > > > > the
> >> > > > > > > > > new one.
> >> > > > > > > > >
> >> > > > > > > > > Let me clarify all of this in the KIP.
> >> > > > > > > > >
> >> > > > > > > > > > 4. I am happy that we are pushing the pattern
> >> subscriptions
> >> > > to
> >> > > > > the
> >> > > > > > > > > server,
> >> > > > > > > > > > but it seems like there could be some tricky
> >> compatibility
> >> > > > > issues.
> >> > > > > > > Will
> >> > > > > > > > > we
> >> > > > > > > > > > have a mechanism for users to detect that they need to
> >> > update
> >> > > > > their
> >> > > > > > > regex
> >> > > > > > > > > > before switching to the new protocol?
> >> > > > > > > > >
> >> > > > > > > > > I think that I am a bit more optimistic than you on this
> >> > > point. I
> >> > > > > > > > > believe that the majority of the cases are simple
> regexes
> >> > which
> >> > > > > should
> >> > > > > > > > > work with the new engine. The coordinator will verify
> the
> >> > regex
> >> > > > > anyway
> >> > > > > > > > > and reject the consumer if the regex is not valid.
> Coming
> >> > back
> >> > > to
> >> > > > > the
> >> > > > > > > > > migration path, in the worst case, the first upgraded
> >> > consumer
> >> > > > > joining
> >> > > > > > > > > the group will be rejected. This should be used as the
> >> last
> >> > > > > defence, I
> >> > > > > > > > > would say.
> >> > > > > > > > >
> >> > > > > > > > > One way for customers to validate their regex before
> >> > upgrading
> >> > > > > their
> >> > > > > > > > > prod would be to test them with another group. For
> >> instance,
> >> > > that
> >> > > > > > > > > could be done in a pre-prod environment. Another way
> >> would be
> >> > > to
> >> > > > > > > > > extend the consumer-group tool to provide a regex
> >> validation
> >> > > > > > > > > mechanism. Would this be enough in your opinion?
> >> > > > > > > > >
> >> > > > > > > > > > 5. Related to the last question, will the Java client
> >> allow
> >> > > the
> >> > > > > > > users to
> >> > > > > > > > > > stick with the current regex engine for compatibility
> >> > > reasons?
> >> > > > > For
> >> > > > > > > > > example,
> >> > > > > > > > > > it may be handy to keep using client based regex at
> >> first
> >> > to
> >> > > keep
> >> > > > > > > > > > migrations simple and then migrate to server based
> >> regexes
> >> > > as a
> >> > > > > > > second
> >> > > > > > > > > step.
> >> > > > > > > > >
> >> > > > > > > > > I understand your point but I am concerned that this
> would
> >> > > allow
> >> > > > > users
> >> > > > > > > > > to actually stay in this mode. That would go against our
> >> goal
> >> > > of
> >> > > > > > > > > simplifying the client because we would have to continue
> >> > > monitoring
> >> > > > > > > > > the metadata on the client side. I would rather not do
> >> this.
> >> > > > > > > > >
> >> > > > > > > > > > 6. When we say that the group coordinator will be
> >> > > responsible for
> >> > > > > > > storing
> >> > > > > > > > > > the configurations and that the configurations will be
> >> > > deleted
> >> > > > > when
> >> > > > > > > the
> >> > > > > > > > > > group is deleted. Will a transition to DEAD trigger
> >> > deletion
> >> > > of
> >> > > > > > > > > > configurations?
> >> > > > > > > > >
> >> > > > > > > > > That's right. The configurations will be deleted when
> the
> >> > > group is
> >> > > > > > > > > deleted. They go together.
> >> > > > > > > > >
> >> > > > > > > > > > 7. Will the choice to store the configs in the group
> >> > > coordinator
> >> > > > > > > make it
> >> > > > > > > > > > harder to list all cluster configs and their values?
> >> > > > > > > > >
> >> > > > > > > > > I don't think so. The group configurations are overrides
> >> of
> >> > > cluster
> >> > > > > > > > > configs. If you want to know all the overrides though,
> you
> >> > > would
> >> > > > > have
> >> > > > > > > > > to ask all the group coordinators. You cannot rely on
> the
> >> > > metadata
> >> > > > > log
> >> > > > > > > > > for instance.
> >> > > > > > > > >
> >> > > > > > > > > > 8. How would someone configure a group before starting
> >> the
> >> > > > > consumers?
> >> > > > > > > > > Have
> >> > > > > > > > > > we considered allowing the explicit creation of
> groups?
> >> > > > > > > Alternatively,
> >> > > > > > > > > the
> >> > > > > > > > > > configs could be decoupled from the group lifecycle.
> >> > > > > > > > >
> >> > > > > > > > > Yes. The group will be automatically created in this
> case.
> >> > > However,
> >> > > > > > > > > the configs will be lost after the retention period of
> the
> >> > > group
> >> > > > > > > > > passes.
> >> > > > > > > > >
> >> > > > > > > > > > 9. Will the Consumer.subscribe method for the Java
> >> client
> >> > > still
> >> > > > > take
> >> > > > > > > a
> >> > > > > > > > > > `java.util.regex.Pattern` of do we have to introduce
> an
> >> > > overload?
> >> > > > > > > > >
> >> > > > > > > > > That's a very group question. I forgot about that one.
> As
> >> the
> >> > > > > > > > > `java.util.regex.Pattern` is not fully compatible with
> the
> >> > > engine
> >> > > > > that
> >> > > > > > > > > we plan to use, it might be better to deprecate it and
> >> use an
> >> > > > > overload
> >> > > > > > > > > which takes a string. We would rely on the server side
> >> > > validation.
> >> > > > > > > > > During the migration, I think that we could still try to
> >> > > toString
> >> > > > > the
> >> > > > > > > > > regex and use it. That should work, I think, in the
> >> majority
> >> > > of the
> >> > > > > > > > > cases.
> >> > > > > > > > >
> >> > > > > > > > > > 10. I agree with Justine that we should be clearer
> about
> >> > the
> >> > > > > reason
> >> > > > > > > to
> >> > > > > > > > > > switch to IBP/metadata.version from the feature flag.
> >> Maybe
> >> > > we
> >> > > > > mean
> >> > > > > > > that
> >> > > > > > > > > we
> >> > > > > > > > > > can switch the default for the feature flag to true
> >> based
> >> > on
> >> > > the
> >> > > > > > > > > > metadata.version once we want to make it the default.
> >> > > > > > > > >
> >> > > > > > > > > My plan was to use that feature flag mainly during the
> >> > > development
> >> > > > > > > > > phase. I should not have mentioned it, I think, because
> we
> >> > > could
> >> > > > > use
> >> > > > > > > > > an internal config for it.
> >> > > > > > > > >
> >> > > > > > > > > > 11. Some of the protocol APIs don't mention the
> required
> >> > > ACLs, it
> >> > > > > > > would
> >> > > > > > > > > be
> >> > > > > > > > > > good to add that for consistency.
> >> > > > > > > > >
> >> > > > > > > > > Noted.
> >> > > > > > > > >
> >> > > > > > > > > > 12. It is a bit odd that ConsumerGroupHeartbeat
> requires
> >> > > "Read
> >> > > > > Group"
> >> > > > > > > > > even
> >> > > > > > > > > > though it seems to do more than reading.
> >> > > > > > > > >
> >> > > > > > > > > I agree. This is how the current protocol works though.
> We
> >> > only
> >> > > > > > > > > require "Read Group" to join a group. We could consider
> >> > > changing
> >> > > > > this
> >> > > > > > > > > but I am not sure that it is worth it.
> >> > > > > > > > >
> >> > > > > > > > > > 13. How is topic recreation handled by the consumer
> with
> >> > the
> >> > > new
> >> > > > > > > group
> >> > > > > > > > > > protocol? It would be good to have a section on this.
> >> > > > > > > > >
> >> > > > > > > > > Noted. From a protocol perspective, the new topic will
> >> have a
> >> > > new
> >> > > > > > > > > topic id so it will treat it like a topic with a
> different
> >> > > name.
> >> > > > > The
> >> > > > > > > > > only issue is that the fetch/commit offsets APIs do not
> >> > support
> >> > > > > topic
> >> > > > > > > > > IDs so the consumer would reuse the offsets based on the
> >> > same.
> >> > > I
> >> > > > > think
> >> > > > > > > > > that we should update those APIs as well in order to be
> >> > > consistent
> >> > > > > end
> >> > > > > > > > > to end. That would strengthen the semantics of the
> >> consumer.
> >> > > > > > > > >
> >> > > > > > > > > > 14. The KIP mentions we will write the new coordinator
> >> in
> >> > > Java.
> >> > > > > Even
> >> > > > > > > > > though
> >> > > > > > > > > > this is an implementation detail, do we plan to have a
> >> new
> >> > > gradle
> >> > > > > > > module
> >> > > > > > > > > > for it?
> >> > > > > > > > >
> >> > > > > > > > > Yes.
> >> > > > > > > > >
> >> > > > > > > > > > 15. Do we have a scalability goal when it comes to how
> >> many
> >> > > > > members
> >> > > > > > > the
> >> > > > > > > > > new
> >> > > > > > > > > > group protocol can support?
> >> > > > > > > > >
> >> > > > > > > > > We don't have numbers at the moment. The protocol should
> >> > > support
> >> > > > > 1000s
> >> > > > > > > > > of members per group. We will measure this when we have
> a
> >> > first
> >> > > > > > > > > implementation. Note that we might have other
> bottlenecks
> >> > down
> >> > > the
> >> > > > > > > > > road (e.g. offset commits).
> >> > > > > > > > >
> >> > > > > > > > > > 16. Did we consider having SubscribedTopidIds instead
> of
> >> > > > > > > > > > SubscribedTopicNames in ConsumerGroupHeartbeatRequest?
> >> Is
> >> > the
> >> > > > > idea
> >> > > > > > > that
> >> > > > > > > > > > since we have to resolve the regex on the server, we
> >> can do
> >> > > the
> >> > > > > same
> >> > > > > > > for
> >> > > > > > > > > > the topic name? The difference is that sending the
> >> regex is
> >> > > more
> >> > > > > > > > > efficient
> >> > > > > > > > > > whereas sending the topic names is less efficient.
> >> > > Furthermore,
> >> > > > > > > delete
> >> > > > > > > > > and
> >> > > > > > > > > > recreation is easier to handle if we have topic ids.
> >> > > > > > > > >
> >> > > > > > > > > The idea was to consolidate the metadata lookup on the
> >> server
> >> > > for
> >> > > > > both
> >> > > > > > > > > paths but I do agree with your point. As a second
> though,
> >> > using
> >> > > > > topic
> >> > > > > > > > > ids may be better here for the delete and recreation
> case.
> >> > > Also, I
> >> > > > > > > > > suppose that we may allow users to subscribe with topic
> >> ids
> >> > in
> >> > > the
> >> > > > > > > > > future because that is the only way to be really robust
> to
> >> > > topic
> >> > > > > > > > > re-creation.
> >> > > > > > > > >
> >> > > > > > > > > Best,
> >> > > > > > > > > David
> >> > > > > > > > >
> >> > > > > > > > > On Tue, Jul 12, 2022 at 1:38 PM David Jacot <
> >> > > dja...@confluent.io>
> >> > > > > > > wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > Hi Justine,
> >> > > > > > > > > >
> >> > > > > > > > > > Thanks for your comments. Please find my answers
> below.
> >> > > > > > > > > >
> >> > > > > > > > > > - Yes, the new protocol relies on topic IDs with the
> >> > > exception
> >> > > > > of the
> >> > > > > > > > > > topic names based in the
> ConsumerGroupHeartbeatRequest.
> >> I
> >> > am
> >> > > not
> >> > > > > sure
> >> > > > > > > > > > if using topic names is the right call here. I need to
> >> > think
> >> > > > > about it
> >> > > > > > > > > > a little more. Obviously, the KIP does not change the
> >> > > > > fetch/commit
> >> > > > > > > > > > offsets RPCs to use topic IDs. This may be something
> >> that
> >> > we
> >> > > > > should
> >> > > > > > > > > > include though as it would give better overall
> >> guarantee in
> >> > > the
> >> > > > > > > > > > producer.
> >> > > > > > > > > > - You're right. I think that I should not have
> mentioned
> >> > this
> >> > > > > flag at
> >> > > > > > > > > > all. I will remove it. We can use an internal
> >> configuration
> >> > > while
> >> > > > > > > > > > developing the feature.
> >> > > > > > > > > > - Both cluster types will be supported. The change is
> >> > > > > orthogonal. The
> >> > > > > > > > > > only requirement is that the cluster uses topic IDs.
> >> > > > > > > > > >
> >> > > > > > > > > > Best,
> >> > > > > > > > > > David
> >> > > > > > > > > >
> >> > > > > > > > > > On Mon, Jul 11, 2022 at 9:53 PM Guozhang Wang <
> >> > > > > wangg...@gmail.com>
> >> > > > > > > > > wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > Hi Ismael,
> >> > > > > > > > > > >
> >> > > > > > > > > > > Thanks for the feedback. Here are some replies
> inlined
> >> > > below:
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Sat, Jul 9, 2022 at 2:53 AM Ismael Juma <
> >> > > ism...@juma.me.uk>
> >> > > > > > > wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > > Thanks for the KIP. This has the potential to be a
> >> > great
> >> > > > > > > > > improvement. A few
> >> > > > > > > > > > > > initial questions/comments:
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > 1. I think it's premature to talk about target
> >> versions
> >> > > for
> >> > > > > > > > > deprecation and
> >> > > > > > > > > > > > removal of the existing group protocol. Unlike
> >> KRaft,
> >> > > this
> >> > > > > > > affects a
> >> > > > > > > > > core
> >> > > > > > > > > > > > client protocol and hence deprecation/removal will
> >> be
> >> > > heavily
> >> > > > > > > > > dependent on
> >> > > > > > > > > > > > how quickly applications migrate to the new
> >> protocol.
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > Yeah I agree with you. I think we can remove the
> >> proposed
> >> > > > > timeline
> >> > > > > > > in
> >> > > > > > > > > the
> >> > > > > > > > > > > `Compatibility, Deprecation, and Migration Plan` and
> >> > > instead
> >> > > > > just
> >> > > > > > > state
> >> > > > > > > > > > > that we will decide in the future about when we
> would
> >> > > > > deprecate old
> >> > > > > > > > > > > protocol and behaviors.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 2. The KIP says we intend to release this in 4.x,
> >> but
> >> > it
> >> > > > > wasn't
> >> > > > > > > made
> >> > > > > > > > > clear
> >> > > > > > > > > > > > why. If we added that as a way to estimate when
> we'd
> >> > > > > deprecate
> >> > > > > > > and
> >> > > > > > > > > remove
> >> > > > > > > > > > > > the group protocol, I also suggest removing this
> >> part.
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > I think that's not specifically related to the
> >> > > > > deprecation/removal
> >> > > > > > > > > timeline
> >> > > > > > > > > > > plan, but it's more for client upgrades. I.e. the
> >> > > broker-side
> >> > > > > > > > > > > implementation may be done first, and then the
> client
> >> > side,
> >> > > > > and we
> >> > > > > > > > > would
> >> > > > > > > > > > > only mark it as "released" by the time clients
> >> > > implementations
> >> > > > > are
> >> > > > > > > > > done. At
> >> > > > > > > > > > > that time, to enable the feature the clients need to
> >> > first
> >> > > > > swap-in
> >> > > > > > > the
> >> > > > > > > > > > > bytecode with a rolling bounce and then set the flag
> >> > with a
> >> > > > > second
> >> > > > > > > > > rolling
> >> > > > > > > > > > > bounce, and hence we feel it's better to be released
> >> in a
> >> > > major
> >> > > > > > > > > version.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 3. We need to flesh out the details of the
> migration
> >> > > story.
> >> > > > > It
> >> > > > > > > > > sounds like
> >> > > > > > > > > > > > we're saying we will support online migrations. Is
> >> that
> >> > > > > correct?
> >> > > > > > > We
> >> > > > > > > > > should
> >> > > > > > > > > > > > explain this in detail. It could also be done as a
> >> > > separate
> >> > > > > KIP,
> >> > > > > > > if
> >> > > > > > > > > it's
> >> > > > > > > > > > > > easier.
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > Yes I think that's the part we can be more concrete
> >> about
> >> > > for
> >> > > > > sure
> >> > > > > > > (and
> >> > > > > > > > > > > this is related to your question 2) above). We will
> >> work
> >> > on
> >> > > > > making
> >> > > > > > > it
> >> > > > > > > > > more
> >> > > > > > > > > > > explicit in parallel as we solicit more feedback.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 4. I am happy that we are pushing the pattern
> >> > > subscriptions
> >> > > > > to
> >> > > > > > > the
> >> > > > > > > > > server,
> >> > > > > > > > > > > > but it seems like there could be some tricky
> >> > > compatibility
> >> > > > > > > issues.
> >> > > > > > > > > Will we
> >> > > > > > > > > > > > have a mechanism for users to detect that they
> need
> >> to
> >> > > update
> >> > > > > > > their
> >> > > > > > > > > regex
> >> > > > > > > > > > > > before switching to the new protocol?
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > Yes I think we need some tooling for non-java client
> >> > users
> >> > > to
> >> > > > > sort
> >> > > > > > > of
> >> > > > > > > > > > > "dry-run" the client before switching to the new
> >> > protocol.
> >> > > I
> >> > > > > do not
> >> > > > > > > > > have a
> >> > > > > > > > > > > specific idea on top of my head though, maybe others
> >> like
> >> > > @Matt
> >> > > > > > > > > Howlett can
> >> > > > > > > > > > > chime-in here?
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 5. Related to the last question, will the Java
> >> client
> >> > > allow
> >> > > > > the
> >> > > > > > > > > users to
> >> > > > > > > > > > > > stick with the current regex engine for
> >> compatibility
> >> > > > > reasons?
> >> > > > > > > For
> >> > > > > > > > > example,
> >> > > > > > > > > > > > it may be handy to keep using client based regex
> at
> >> > > first to
> >> > > > > keep
> >> > > > > > > > > > > > migrations simple and then migrate to server based
> >> > > regexes
> >> > > > > as a
> >> > > > > > > > > second
> >> > > > > > > > > > > > step.
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > Honestly I have not thought about that for java
> >> clients,
> >> > > and
> >> > > > > we can
> >> > > > > > > > > discuss
> >> > > > > > > > > > > that. What kind of compatibility issues do you have
> in
> >> > > mind?
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 6. When we say that the group coordinator will be
> >> > > > > responsible for
> >> > > > > > > > > storing
> >> > > > > > > > > > > > the configurations and that the configurations
> will
> >> be
> >> > > > > deleted
> >> > > > > > > when
> >> > > > > > > > > the
> >> > > > > > > > > > > > group is deleted. Will a transition to DEAD
> trigger
> >> > > deletion
> >> > > > > of
> >> > > > > > > > > > > > configurations?
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > Yes, since the DEAD state is an ending state (we
> would
> >> > only
> >> > > > > > > transit to
> >> > > > > > > > > that
> >> > > > > > > > > > > state when the group is EMPTY and also all of its
> >> > metadata
> >> > > are
> >> > > > > > > gone),
> >> > > > > > > > > once
> >> > > > > > > > > > > it's transited to DEAD this group would never be
> >> revived.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 7. Will the choice to store the configs in the
> group
> >> > > > > coordinator
> >> > > > > > > > > make it
> >> > > > > > > > > > > > harder to list all cluster configs and their
> values?
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > That's a good question, and our thoughts are that
> the
> >> > > so-called
> >> > > > > > > "group
> >> > > > > > > > > > > configurations" are overrides of the cluster-level
> >> > > > > configurations
> >> > > > > > > > > > > customized per group so when an admin list cluster
> >> > configs
> >> > > it's
> >> > > > > > > okay to
> >> > > > > > > > > > > list just the cluster-level "defaults", not showing
> >> any
> >> > > > > per-group
> >> > > > > > > > > > > customizations.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 8. How would someone configure a group before
> >> starting
> >> > > the
> >> > > > > > > > > consumers? Have
> >> > > > > > > > > > > > we considered allowing the explicit creation of
> >> groups?
> >> > > > > > > > > Alternatively, the
> >> > > > > > > > > > > > configs could be decoupled from the group
> lifecycle.
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > The configs can be created before the group itself
> as
> >> an
> >> > > > > > > independent
> >> > > > > > > > > entity
> >> > > > > > > > > > > --- of course, this requires the corresponding
> >> request to
> >> > > be
> >> > > > > > > routed to
> >> > > > > > > > > the
> >> > > > > > > > > > > right coordinator based on the group id --- the only
> >> > thing
> >> > > that
> >> > > > > > > > > differs is,
> >> > > > > > > > > > > when the group itself is gone we also check if there
> >> are
> >> > > any
> >> > > > > > > > > configuration
> >> > > > > > > > > > > entities related to that group and delete as well.
> >> > > > > > > > > > >
> >> > > > > > > > > > > Admittedly this indeed introduces an asymmetry on
> the
> >> > > creation
> >> > > > > /
> >> > > > > > > > > deletion
> >> > > > > > > > > > > lifecycles of the config entities, and we would like
> >> to
> >> > > hear
> >> > > > > > > everyone's
> >> > > > > > > > > > > feelings whether we should aim for symmetry i.e.
> >> totally
> >> > > > > decouple
> >> > > > > > > group
> >> > > > > > > > > > > configs and hence not delete them at all when the
> >> group
> >> > is
> >> > > > > gone,
> >> > > > > > > but
> >> > > > > > > > > always
> >> > > > > > > > > > > require explicit deletion operations by themselves.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 9. Will the Consumer.subscribe method for the Java
> >> > client
> >> > > > > still
> >> > > > > > > take
> >> > > > > > > > > a
> >> > > > > > > > > > > > `java.util.regex.Pattern` of do we have to
> >> introduce an
> >> > > > > overload?
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > I think we do not need to introduce an overload, but
> >> I'm
> >> > > all
> >> > > > > ears
> >> > > > > > > if
> >> > > > > > > > > there
> >> > > > > > > > > > > may be some compatibility issues that we may
> overlook.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 10. I agree with Justine that we should be clearer
> >> > about
> >> > > the
> >> > > > > > > reason
> >> > > > > > > > > to
> >> > > > > > > > > > > > switch to IBP/metadata.version from the feature
> >> flag.
> >> > > Maybe
> >> > > > > we
> >> > > > > > > mean
> >> > > > > > > > > that we
> >> > > > > > > > > > > > can switch the default for the feature flag to
> true
> >> > > based on
> >> > > > > the
> >> > > > > > > > > > > > metadata.version once we want to make it the
> >> default.
> >> > > > > > > > > > >
> >> > > > > > > > > > > 11. Some of the protocol APIs don't mention the
> >> required
> >> > > ACLs,
> >> > > > > it
> >> > > > > > > > > would be
> >> > > > > > > > > > > > good to add that for consistency.
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > Ack, we can certainly do that.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 12. It is a bit odd that ConsumerGroupHeartbeat
> >> > requires
> >> > > > > "Read
> >> > > > > > > > > Group" even
> >> > > > > > > > > > > > though it seems to do more than reading.
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > I had that thought myself as well, but in the end we
> >> > could
> >> > > not
> >> > > > > > > find a
> >> > > > > > > > > > > better alternative: adding Write Group seems an
> >> overkill
> >> > > here
> >> > > > > > > since we
> >> > > > > > > > > do
> >> > > > > > > > > > > not have it elsewhere (we only have Read / Delete
> and
> >> > > Describe
> >> > > > > on
> >> > > > > > > > > groups so
> >> > > > > > > > > > > far). Would like to hear others thoughts.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 13. How is topic recreation handled by the
> consumer
> >> > with
> >> > > the
> >> > > > > new
> >> > > > > > > > > group
> >> > > > > > > > > > > > protocol? It would be good to have a section on
> >> this.
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > You mean with regex subscription right? Yes we can
> >> add a
> >> > > > > section
> >> > > > > > > about
> >> > > > > > > > > > > that, but basically the idea is that consumer would
> be
> >> > > totally
> >> > > > > > > > > agnostic in
> >> > > > > > > > > > > the new protocol as it's handled all by the brokers.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 14. The KIP mentions we will write the new
> >> coordinator
> >> > in
> >> > > > > Java.
> >> > > > > > > Even
> >> > > > > > > > > though
> >> > > > > > > > > > > > this is an implementation detail, do we plan to
> >> have a
> >> > > new
> >> > > > > gradle
> >> > > > > > > > > module
> >> > > > > > > > > > > > for it?
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > We have not thought about that. But I think the
> answer
> >> > > should
> >> > > > > be
> >> > > > > > > yes.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 15. Do we have a scalability goal when it comes to
> >> how
> >> > > many
> >> > > > > > > members
> >> > > > > > > > > the new
> >> > > > > > > > > > > > group protocol can support?
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > Within a group, I think we should shoot for 1000s of
> >> > > members.
> >> > > > > But
> >> > > > > > > that
> >> > > > > > > > > > > scalability goals also depend on the offset
> management
> >> > > (commit,
> >> > > > > > > fetch)
> >> > > > > > > > > > > capabilities of the coordinator which we did not
> >> cover in
> >> > > this
> >> > > > > > > KIP, so
> >> > > > > > > > > it's
> >> > > > > > > > > > > hard to give a number that applies universally.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > > 16. Did we consider having SubscribedTopidIds
> >> instead
> >> > of
> >> > > > > > > > > > > > SubscribedTopicNames in
> >> ConsumerGroupHeartbeatRequest?
> >> > > Is the
> >> > > > > > > idea
> >> > > > > > > > > that
> >> > > > > > > > > > > > since we have to resolve the regex on the server,
> we
> >> > can
> >> > > do
> >> > > > > the
> >> > > > > > > same
> >> > > > > > > > > for
> >> > > > > > > > > > > > the topic name? The difference is that sending the
> >> > regex
> >> > > is
> >> > > > > more
> >> > > > > > > > > efficient
> >> > > > > > > > > > > > whereas sending the topic names is less efficient.
> >> > > > > Furthermore,
> >> > > > > > > > > delete and
> >> > > > > > > > > > > > recreation is easier to handle if we have topic
> ids.
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > The main reason to still let the clients send names
> >> is to
> >> > > keep
> >> > > > > the
> >> > > > > > > > > > > reasoning of names -> ids on the broker / admin
> client
> >> > > only.
> >> > > > > Note
> >> > > > > > > that
> >> > > > > > > > > > > although we added topic id in KIP-516, we never
> >> > > implemented the
> >> > > > > > > logic
> >> > > > > > > > > on
> >> > > > > > > > > > > consumer/producers leveraging the related newer
> >> versioned
> >> > > RPCs,
> >> > > > > > > > > instead we
> >> > > > > > > > > > > just set the topic id as empty UUID. We want to keep
> >> the
> >> > > > > > > > > consumer/producer
> >> > > > > > > > > > > to be thin and only delegate the reasoning on broker
> >> and
> >> > > > > > > potentially
> >> > > > > > > > > admin
> >> > > > > > > > > > > clients.
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Thanks,
> >> > > > > > > > > > > > Ismael
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > On Wed, Jul 6, 2022 at 10:45 AM David Jacot
> >> > > > > > > > > <dja...@confluent.io.invalid>
> >> > > > > > > > > > > > wrote:
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > > Hi all,
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > I would like to start a discussion thread on
> >> KIP-848:
> >> > > The
> >> > > > > Next
> >> > > > > > > > > > > > > Generation of the Consumer Rebalance Protocol.
> >> With
> >> > > this
> >> > > > > KIP,
> >> > > > > > > we
> >> > > > > > > > > aim
> >> > > > > > > > > > > > > to make the rebalance protocol (for consumers)
> >> more
> >> > > > > reliable,
> >> > > > > > > more
> >> > > > > > > > > > > > > scalable, easier to implement for clients, and
> >> easier
> >> > > to
> >> > > > > debug
> >> > > > > > > for
> >> > > > > > > > > > > > > operators.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > The KIP is here:
> >> > > > > https://cwiki.apache.org/confluence/x/HhD1D.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Please take a look and let me know what you
> think.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Best,
> >> > > > > > > > > > > > > David
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > PS: I will be away from July 18th to August 8th.
> >> That
> >> > > gives
> >> > > > > > > you a
> >> > > > > > > > > bit
> >> > > > > > > > > > > > > of time to read and digest this long KIP.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > >
> >> > > > > > > > > > > --
> >> > > > > > > > > > > -- Guozhang
> >> > > > > > > > >
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > --
> >> > > > > > > > -- Guozhang
> >> > > > > > >
> >> > > > >
> >> > >
> >> > >
> >> >
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>

Reply via email to