Hello David,

One of Jun's comments got me thinking:

```
In this case, a new assignment is triggered by the client side
assignor. When constructing the HB, the consumer will always consult
the client side assignor and propagate the information to the group
coordinator. In other words, we don't expect users to call
Consumer#enforceRebalance anymore.
```

Looking at the current PartitionAssignor interface, we do not yet have a
way to instruct the consumer how to construct the next HB request. For
example, when the assignor wants to enforce a new rebalance with a new
assignment, we'd need some customizable API inside the PartitionAssignor to
indicate that the next HB should tell the broker so. WDYT about adding such
an API to the PartitionAssignor?
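
To make the idea a bit more concrete, here is a rough sketch of what such
a hook could look like. All names below (ClientAssignorSketch,
onNextHeartbeat, RebalanceRequest, HeartbeatSketch) are made up for
illustration and are not part of the KIP or of Kafka; the reason and
metadata fields only mirror the "non-zero reason" and "assignor metadata"
wording in the KIP.

```java
import java.util.Optional;

// Hypothetical sketch: none of these types exist in Kafka or in the KIP.
final class AssignorHeartbeatSketch {

    /** What the assignor wants the next HB to carry. */
    static final class RebalanceRequest {
        final int reason;       // non-zero means "please compute a new assignment"
        final byte[] metadata;  // opaque assignor metadata

        RebalanceRequest(int reason, byte[] metadata) {
            if (reason == 0) {
                throw new IllegalArgumentException("reason must be non-zero");
            }
            this.reason = reason;
            this.metadata = metadata;
        }
    }

    /** Stand-in for the client side assignor, with the extra hook added. */
    interface ClientAssignorSketch {
        String name();

        /** Consulted each time the consumer builds a HB; empty means nothing to report. */
        Optional<RebalanceRequest> onNextHeartbeat();
    }

    /** Only the assignor-related part of a HB request. */
    static final class HeartbeatSketch {
        final String assignorName;
        final int reason;
        final byte[] metadata;

        HeartbeatSketch(String assignorName, int reason, byte[] metadata) {
            this.assignorName = assignorName;
            this.reason = reason;
            this.metadata = metadata;
        }
    }

    /** The consumer always consults the assignor when constructing the HB. */
    static HeartbeatSketch buildHeartbeat(ClientAssignorSketch assignor) {
        return assignor.onNextHeartbeat()
            .map(r -> new HeartbeatSketch(assignor.name(), r.reason, r.metadata))
            .orElseGet(() -> new HeartbeatSketch(assignor.name(), 0, new byte[0]));
    }

    public static void main(String[] args) {
        ClientAssignorSketch assignor = new ClientAssignorSketch() {
            @Override public String name() { return "custom"; }
            @Override public Optional<RebalanceRequest> onNextHeartbeat() {
                return Optional.of(new RebalanceRequest(1, new byte[] {42}));
            }
        };
        HeartbeatSketch hb = buildHeartbeat(assignor);
        System.out.println(hb.assignorName + " reason=" + hb.reason);
    }
}
```

With something along these lines, the user never calls
Consumer#enforceRebalance; the assignor decides, and the consumer simply
propagates whatever the hook returns on the next HB.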


Guozhang


On Tue, Sep 6, 2022 at 6:09 AM David Jacot <dja...@confluent.io.invalid>
wrote:

> Hi Jun,
>
> I have updated the KIP to include your feedback. I have also tried to
> clarify the parts which were not clear.
>
> Best,
> David
>
> On Fri, Sep 2, 2022 at 4:18 PM David Jacot <dja...@confluent.io> wrote:
> >
> > Hi Jun,
> >
> > Thanks for your feedback. Let me start by answering your questions
> > inline and I will update the KIP next week.
> >
> > > Thanks for the KIP. Overall, the main benefits of the KIP seem to be
> > > fewer RPCs during rebalance and more efficient support of wildcard. A
> > > few comments below.
> >
> > I would also add that the KIP removes the global sync barrier in the
> > protocol which is essential to improve group stability and
> > scalability, and the KIP also simplifies the client by moving most of
> > the logic to the server side.
> >
> > > 30. ConsumerGroupHeartbeatRequest
> > > 30.1 ServerAssignor is a singleton. Do we plan to support rolling
> changing
> > > of the partition assignor in the consumers?
> >
> > Definitely. The group coordinator will use the assignor used by a
> > majority of the members. This allows the group to move from one
> > assignor to another by a roll. This is explained in the Assignor
> > Selection chapter.
> >
> > > 30.2 For each field, could you explain whether it's required in every
> > > request or the scenarios when it needs to be filled? For example, it's
> not
> > > clear to me when TopicPartitions needs to be filled.
> >
> > The client is expected to set those fields in case of a connection
> > issue (e.g. timeout) or when the fields have changed since the last
> > HB. The server populates those fields as long as the member is not
> > fully reconciled - the member should acknowledge that it has the
> > expected epoch and assignment. I will clarify this in the KIP.
> >
> > > 31. In the current consumer protocol, the rack affinity between the
> client
> > > and the broker is only considered during fetching, but not during
> assigning
> > > partitions to consumers. Sometimes, once the assignment is made, there
> is
> > > no opportunity for read affinity because no replicas of assigned
> partitions
> > > are close to the member. I am wondering if we should use this
> opportunity
> > > to address this by including rack in GroupMember.
> >
> > That's an interesting idea. I don't see any issue with adding the rack
> > to the members. I will do so.
> >
> > > 32. On the metric side, often, it's useful to know how busy a group
> > > coordinator is. By moving to the event loop model, it seems that we
> > > could add a metric that tracks the fraction of the time the event loop
> > > is doing the actual work.
> >
> > That's a great idea. I will add it. Thanks.
> >
> > > 33. Could we add a section on coordinator failover handling? For
> example,
> > > does it need to trigger the check if any group with the wildcard
> > > subscription now has a new matching topic?
> >
> > Sure. When the new group coordinator takes over, it has to:
> > * Set up the session timeouts.
> > * Trigger a new assignment if a client side assignor is used. We don't
> > store the information about the member selected to run the assignment
> > so we have to start a new one.
> > * Update the topics metadata, verify the wildcard subscriptions, and
> > trigger a rebalance if needed.
> >
> > > 34. ConsumerGroupMetadataValue, ConsumerGroupPartitionMetadataValue,
> > > ConsumerGroupMemberMetadataValue: Could we document what the epoch
> field
> > > reflects? For example, does the epoch in ConsumerGroupMetadataValue
> reflect
> > > the latest group epoch? What about the one in
> > > ConsumerGroupPartitionMetadataValue and
> ConsumerGroupMemberMetadataValue?
> >
> > Sure. I will clarify that but it is always the latest group epoch.
> > When the group state is updated, the group epoch is bumped so we use
> > that one for all the change records related to the update.
> >
> > > 35. "the group coordinator will ensure that the following invariants
> are
> > > met: ... All members exists." It's possible for a member not to get any
> > > assigned partitions, right?
> >
> > That's right. Here I meant that the members provided by the assignor
> > in the assignment must exist in the group. The assignor can not make
> > up new member ids.
> >
> > > 36. "He can rejoins the group with a member epoch equals to 0": When
> would
> > > a consumer rejoin and what member id would be used?
> >
> > A member is expected to abandon all its partitions and rejoin when it
> > receives the FENCED_MEMBER_EPOCH error. In this case, the group
> > coordinator will have removed the member from the group. The member
> > can rejoin the group with the same member id but with 0 as epoch. Let
> > me see if I can clarify this in the KIP.
> >
> > > 37. "Instead, power users will have the ability to trigger a
> > > reassignment by either providing a non-zero reason or by updating the
> > > assignor metadata." Hmm, this seems to be conflicting with the
> > > deprecation of Consumer#enforceRebalance.
> >
> > In this case, a new assignment is triggered by the client side
> > assignor. When constructing the HB, the consumer will always consult
> > the client side assignor and propagate the information to the group
> > coordinator. In other words, we don't expect users to call
> > Consumer#enforceRebalance anymore.
> >
> > > 38. The reassignment examples are nice. But the section seems to have
> > > multiple typos.
> > > 38.1 When the group transitions to epoch 2, B immediately gets into
> > > "epoch=1, partitions=[foo-2]", which seems incorrect.
> > > 38.2 When the group transitions to epoch 3, C seems to get into
> epoch=3,
> > > partitions=[foo-1] too early.
> > > 38.3 After A transitions to epoch 3, C still has A - epoch=2,
> > > partitions=[foo-0].
> >
> > Sorry for that! I will revise them.
> >
> > > 39. Rolling upgrade of consumers: Do we support the upgrade from any
> old
> > > version to new one?
> >
> > We will support upgrading from the consumer protocol version 3,
> > introduced in KIP-792. KIP-792 is not implemented yet so the earliest
> > version is unknown at the moment. This is explained in the migration
> > plan chapter.
> >
> > Thanks again for your feedback, Jun. I will update the KIP based on it
> > next week.
> >
> > Best,
> > David
> >
> > On Thu, Sep 1, 2022 at 9:07 PM Jun Rao <j...@confluent.io.invalid> wrote:
> > >
> > > Hi, David,
> > >
> > > Thanks for the KIP. Overall, the main benefits of the KIP seem to be
> fewer
> > > RPCs during rebalance and more efficient support of wildcard. A few
> > > comments below.
> > >
> > > 30. ConsumerGroupHeartbeatRequest
> > > 30.1 ServerAssignor is a singleton. Do we plan to support rolling
> changing
> > > of the partition assignor in the consumers?
> > > 30.2 For each field, could you explain whether it's required in every
> > > request or the scenarios when it needs to be filled? For example, it's
> not
> > > clear to me when TopicPartitions needs to be filled.
> > >
> > > 31. In the current consumer protocol, the rack affinity between the
> client
> > > and the broker is only considered during fetching, but not during
> assigning
> > > partitions to consumers. Sometimes, once the assignment is made, there
> is
> > > no opportunity for read affinity because no replicas of assigned
> partitions
> > > are close to the member. I am wondering if we should use this
> opportunity
> > > to address this by including rack in GroupMember.
> > >
> > > > 32. On the metric side, often, it's useful to know how busy a group
> > > > coordinator is. By moving to the event loop model, it seems that we
> > > > could add a metric that tracks the fraction of the time the event
> > > > loop is doing the actual work.
> > >
> > > 33. Could we add a section on coordinator failover handling? For
> example,
> > > does it need to trigger the check if any group with the wildcard
> > > subscription now has a new matching topic?
> > >
> > > 34. ConsumerGroupMetadataValue, ConsumerGroupPartitionMetadataValue,
> > > ConsumerGroupMemberMetadataValue: Could we document what the epoch
> field
> > > reflects? For example, does the epoch in ConsumerGroupMetadataValue
> reflect
> > > the latest group epoch? What about the one in
> > > ConsumerGroupPartitionMetadataValue and
> ConsumerGroupMemberMetadataValue?
> > >
> > > 35. "the group coordinator will ensure that the following invariants
> are
> > > met: ... All members exists." It's possible for a member not to get any
> > > assigned partitions, right?
> > >
> > > 36. "He can rejoins the group with a member epoch equals to 0": When
> would
> > > a consumer rejoin and what member id would be used?
> > >
> > > 37. "Instead, power users will have the ability to trigger a
> reassignment
> > > by either providing a non-zero reason or by updating the assignor
> > > metadata." Hmm, this seems to be conflicting with the deprecation of
> > > Consumer#enforeRebalance.
> > >
> > > 38. The reassignment examples are nice. But the section seems to have
> > > multiple typos.
> > > 38.1 When the group transitions to epoch 2, B immediately gets into
> > > "epoch=1, partitions=[foo-2]", which seems incorrect.
> > > 38.2 When the group transitions to epoch 3, C seems to get into
> epoch=3,
> > > partitions=[foo-1] too early.
> > > 38.3 After A transitions to epoch 3, C still has A - epoch=2,
> > > partitions=[foo-0].
> > >
> > > 39. Rolling upgrade of consumers: Do we support the upgrade from any
> old
> > > version to new one?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Mon, Aug 29, 2022 at 9:20 AM David Jacot
> <dja...@confluent.io.invalid>
> > > wrote:
> > >
> > > > Hi all,
> > > >
> > > > > The KIP states that we will re-implement the coordinator in Java. I
> > > > > discussed this offline with a few folks who are concerned that
> > > > > we could introduce many regressions in the old protocol if we do so.
> > > > Therefore, I am going to remove this statement from the KIP. It is an
> > > > implementation detail after all so it does not have to be decided at
> > > > this stage. We will likely start by trying to refactor the current
> > > > implementation as a first step.
> > > >
> > > > Cheers,
> > > > David
> > > >
> > > > On Mon, Aug 29, 2022 at 3:52 PM David Jacot <dja...@confluent.io>
> wrote:
> > > > >
> > > > > Hi Luke,
> > > > >
> > > > > > 1.1. I think the state machine are: "Empty, assigning,
> reconciling,
> > > > stable,
> > > > > > dead" mentioned in Consumer Group States section, right?
> > > > >
> > > > > > This sentence does not refer to those group states but rather to
> > > > > > state machine replication (SMR), i.e. the entire state of the
> > > > > > group coordinator, which is replicated via the log layer. I will
> > > > > > clarify this in the KIP.
> > > > >
> > > > > > 1.2. What do you mean "each state machine is modelled as an event
> > > > loop"?
> > > > >
> > > > > > The idea is to follow a model similar to the new quorum controller.
> > > > > > We will have N threads to process events. Each __consumer_offsets
> > > > > > partition is assigned to a unique thread and all the events (e.g.
> > > > > > requests, callbacks, etc.) are processed by this thread. This
> > > > > > simplifies concurrency and will enable us to do simulation testing
> > > > > > for the group coordinator.
> > > > >
> > > > > > 1.3. Why do we need a state machine per *__consumer_offsets*
> > > > partitions?
> > > > > > Not a state machine "per consumer group" owned by a group
> coordinator?
> > > > For
> > > > > > example, if one group coordinator owns 2 consumer groups, and
> both
> > > > exist in
> > > > > > *__consumer_offsets-0*, will we have 1 state machine for it, or
> 2?
> > > > >
> > > > > See 1.1. The confusion comes from there, I think.
> > > > >
> > > > > > 1.4. I know the "*group.coordinator.threads" *is the number of
> threads
> > > > used
> > > > > > to run the state machines. But I'm wondering if the purpose of
> the
> > > > threads
> > > > > > is only to keep the state of each consumer group (or
> > > > *__consumer_offsets*
> > > > > > partitions?), and no heavy computation, why should we need
> > > > multi-threads
> > > > > > here?
> > > > >
> > > > > > See 1.2. The idea is to be able to shard the processing, as the
> > > > > > computation could be heavy.
> > > > >
> > > > > > 2.1. The consumer session timeout, why does the default session
> > > > timeout not
> > > > > > locate between min (45s) and max(60s)? I thought the min/max
> session
> > > > > > timeout is to define lower/upper bound of it, no?
> > > > > >
> > > > > > group.consumer.session.timeout.ms int 30s The timeout to detect
> client
> > > > > > failures when using the consumer group protocol.
> > > > > > group.consumer.min.session.timeout.ms int 45s The minimum
> session
> > > > timeout.
> > > > > > group.consumer.max.session.timeout.ms int 60s The maximum
> session
> > > > timeout.
> > > > >
> > > > > This is indeed a mistake. The default session timeout should be 45s
> > > > > (the current default).
> > > > >
> > > > > > 2.2. The default server side assignor are [range, uniform],
> which means
> > > > > > we'll default to "range" assignor. I'd like to know why not
> uniform
> > > > one? I
> > > > > > thought usually users will choose uniform assignor (former sticky
> > > > assinor)
> > > > > > for better evenly distribution. Any other reason we choose range
> > > > assignor
> > > > > > as default?
> > > > > > group.consumer.assignors List range, uniform The server side
> assignors.
> > > > >
> > > > > > The order on the server side has no influence because the client
> > > > > > must choose the assignor that it wants to use. There is no default
> > > > > > in the current proposal. If the assignor is not specified by the
> > > > > > client, the request is rejected. The default client value for
> > > > > > `group.remote.assignor` is `uniform` though.
> > > > >
> > > > > Thanks for your very good comments, Luke. I hope that my answers
> help
> > > > > to clarify things. I will update the KIP as well based on your
> > > > > feedback.
> > > > >
> > > > > Cheers,
> > > > > David
> > > > >
> > > > > On Mon, Aug 22, 2022 at 9:29 AM Luke Chen <show...@gmail.com>
> wrote:
> > > > > >
> > > > > > Hi David,
> > > > > >
> > > > > > Thanks for the update.
> > > > > >
> > > > > > Some more questions:
> > > > > > 1. In Group Coordinator section, you mentioned:
> > > > > > > The new group coordinator will have a state machine per
> > > > > > *__consumer_offsets* partitions, where each state machine is
> modelled
> > > > as an
> > > > > > event loop. Those state machines will be executed in
> > > > > > *group.coordinator.threads* threads.
> > > > > >
> > > > > > 1.1. I think the state machine are: "Empty, assigning,
> reconciling,
> > > > stable,
> > > > > > dead" mentioned in Consumer Group States section, right?
> > > > > > 1.2. What do you mean "each state machine is modelled as an event
> > > > loop"?
> > > > > > 1.3. Why do we need a state machine per *__consumer_offsets*
> > > > partitions?
> > > > > > Not a state machine "per consumer group" owned by a group
> coordinator?
> > > > For
> > > > > > example, if one group coordinator owns 2 consumer groups, and
> both
> > > > exist in
> > > > > > *__consumer_offsets-0*, will we have 1 state machine for it, or
> 2?
> > > > > > 1.4. I know the "*group.coordinator.threads" *is the number of
> threads
> > > > used
> > > > > > to run the state machines. But I'm wondering if the purpose of
> the
> > > > threads
> > > > > > is only to keep the state of each consumer group (or
> > > > *__consumer_offsets*
> > > > > > partitions?), and no heavy computation, why should we need
> > > > multi-threads
> > > > > > here?
> > > > > >
> > > > > > 2. For the default value in the new configs:
> > > > > > 2.1. The consumer session timeout, why does the default session
> > > > timeout not
> > > > > > locate between min (45s) and max(60s)? I thought the min/max
> session
> > > > > > timeout is to define lower/upper bound of it, no?
> > > > > >
> > > > > > group.consumer.session.timeout.ms int 30s The timeout to detect
> client
> > > > > > failures when using the consumer group protocol.
> > > > > > group.consumer.min.session.timeout.ms int 45s The minimum
> session
> > > > timeout.
> > > > > > group.consumer.max.session.timeout.ms int 60s The maximum
> session
> > > > timeout.
> > > > > >
> > > > > >
> > > > > >
> > > > > > 2.2. The default server side assignor are [range, uniform],
> which means
> > > > > > we'll default to "range" assignor. I'd like to know why not
> uniform
> > > > one? I
> > > > > > thought usually users will choose uniform assignor (former sticky
> > > > assinor)
> > > > > > for better evenly distribution. Any other reason we choose range
> > > > assignor
> > > > > > as default?
> > > > > > group.consumer.assignors List range, uniform The server side
> assignors.
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > Thank you.
> > > > > > Luke
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Aug 22, 2022 at 2:10 PM Luke Chen <show...@gmail.com>
> wrote:
> > > > > >
> > > > > > > Hi Sagar,
> > > > > > >
> > > > > > > I have some thoughts about Kafka Connect integrating with
> KIP-848,
> > > > but I
> > > > > > > think we should have a separate discussion thread for the Kafka
> > > > Connect
> > > > > > > KIP: Integrating Kafka Connect With New Consumer Rebalance
> Protocol
> > > > [1],
> > > > > > > and let this discussion thread focus on consumer rebalance
> protocol,
> > > > WDYT?
> > > > > > >
> > > > > > > [1]
> > > > > > >
> > > >
> https://cwiki.apache.org/confluence/display/KAFKA/%5BDRAFT%5DIntegrating+Kafka+Connect+With+New+Consumer+Rebalance+Protocol
> > > > > > >
> > > > > > > Thank you.
> > > > > > > Luke
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Aug 12, 2022 at 9:31 PM Sagar <
> sagarmeansoc...@gmail.com>
> > > > wrote:
> > > > > > >
> > > > > > >> Thank you Guozhang/David for the feedback. Looks like there's
> > > > agreement on
> > > > > > >> using separate APIs for Connect. I would revisit the doc and
> see
> > > > what
> > > > > > >> changes are to be made.
> > > > > > >>
> > > > > > >> Thanks!
> > > > > > >> Sagar.
> > > > > > >>
> > > > > > >> On Tue, Aug 9, 2022 at 7:11 PM David Jacot
> > > > <dja...@confluent.io.invalid>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >> > Hi Sagar,
> > > > > > >> >
> > > > > > >> > Thanks for the feedback and the document. That's really
> helpful. I
> > > > > > >> > will take a look at it.
> > > > > > >> >
> > > > > > >> > Overall, it seems to me that both Connect and the Consumer
> could
> > > > share
> > > > > > >> > the same underlying "engine". The main difference is that
> the
> > > > Consumer
> > > > > > >> > assigns topic-partitions to members whereas Connect assigns
> tasks
> > > > to
> > > > > > >> > workers. I see two ways to move forward:
> > > > > > >> > 1) We extend the new proposed APIs to support different
> resource
> > > > types
> > > > > > >> > (e.g. partitions, tasks, etc.); or
> > > > > > >> > 2) We use new dedicated APIs for Connect. The dedicated APIs
> > > > would be
> > > > > > >> > similar to the new ones but different on the
> content/resources and
> > > > > > >> > they would rely on the same engine on the coordinator side.
> > > > > > >> >
> > > > > > > >> > I personally lean towards 2) because I am not a fan of
> > > > > > > >> > overloading APIs to serve different purposes. That being said, I
> > > > > > > >> > am not opposed to 1) if we can find an elegant way to do it.
> > > > > > >> >
> > > > > > >> > I think that we can continue to discuss it here for now in
> order
> > > > to
> > > > > > >> > ensure that this KIP is compatible with what we will do for
> > > > Connect in
> > > > > > >> > the future.
> > > > > > >> >
> > > > > > >> > Best,
> > > > > > >> > David
> > > > > > >> >
> > > > > > >> > On Mon, Aug 8, 2022 at 2:41 PM David Jacot <
> dja...@confluent.io>
> > > > wrote:
> > > > > > >> > >
> > > > > > >> > > Hi all,
> > > > > > >> > >
> > > > > > >> > > I am back from vacation. I will go through and address
> your
> > > > comments
> > > > > > >> > > in the coming days. Thanks for your feedback.
> > > > > > >> > >
> > > > > > >> > > Cheers,
> > > > > > >> > > David
> > > > > > >> > >
> > > > > > >> > > On Wed, Aug 3, 2022 at 10:05 PM Gregory Harris <
> > > > gharris1...@gmail.com
> > > > > > >> >
> > > > > > >> > wrote:
> > > > > > >> > > >
> > > > > > >> > > > Hey All!
> > > > > > >> > > >
> > > > > > >> > > > Thanks for the KIP, it's wonderful to see cooperative
> > > > rebalancing
> > > > > > >> > making it
> > > > > > >> > > > down the stack!
> > > > > > >> > > >
> > > > > > >> > > > I had a few questions:
> > > > > > >> > > >
> > > > > > >> > > > 1. The 'Rejected Alternatives' section describes how
> member
> > > > epoch
> > > > > > >> > should
> > > > > > >> > > > advance in step with the group epoch and assignment
> epoch
> > > > values. I
> > > > > > >> > think
> > > > > > >> > > > that this is a good idea for the reasons described in
> the
> > > > KIP. When
> > > > > > >> the
> > > > > > >> > > > protocol is incrementally assigning partitions to a
> worker,
> > > > what
> > > > > > >> member
> > > > > > >> > > > epoch does each incremental assignment use? Are member
> epochs
> > > > > > >> re-used,
> > > > > > >> > and
> > > > > > >> > > > a single member epoch can correspond to multiple
> different
> > > > > > >> > (monotonically
> > > > > > >> > > > larger) assignments?
> > > > > > >> > > >
> > > > > > >> > > > 2. Is the Assignor's 'Reason' field opaque to the group
> > > > > > >> coordinator? If
> > > > > > >> > > > not, should custom client-side assignor implementations
> > > > interact
> > > > > > >> with
> > > > > > >> > the
> > > > > > >> > > > Reason field, and how is its common meaning agreed
> upon? If
> > > > so, what
> > > > > > >> > is the
> > > > > > >> > > > benefit of a distinct Reason field over including such
> > > > functionality
> > > > > > >> > in the
> > > > > > >> > > > opaque metadata?
> > > > > > >> > > >
> > > > > > >> > > > 3. The following is included in the KIP: "Thanks to
> this, the
> > > > input
> > > > > > >> of
> > > > > > >> > the
> > > > > > >> > > > client side assignor is entirely driven by the group
> > > > coordinator.
> > > > > > >> The
> > > > > > >> > > > consumer is no longer responsible for maintaining any
> state
> > > > besides
> > > > > > >> its
> > > > > > >> > > > assigned partitions." Does this mean that the
> client-side
> > > > assignor
> > > > > > >> MAY
> > > > > > >> > > > incorporate additional non-Metadata state (such as
> partition
> > > > > > >> > throughput,
> > > > > > >> > > > cpu/memory metrics, config topics, etc), or that
> additional
> > > > > > >> > non-Metadata
> > > > > > >> > > > state SHOULD NOT be used?
> > > > > > >> > > >
> > > > > > >> > > > 4. I see that there are separate classes
> > > > > > >> > > > for
> org.apache.kafka.server.group.consumer.PartitionAssignor
> > > > > > >> > > > and org.apache.kafka.clients.consumer.PartitionAssignor
> that
> > > > seem to
> > > > > > >> > > > overlap significantly. Is it possible for these two
> > > > implementations
> > > > > > >> to
> > > > > > >> > be
> > > > > > >> > > > unified? This would serve to promote feature parity of
> > > > server-side
> > > > > > >> and
> > > > > > >> > > > client-side assignors, and would also facilitate
> operational
> > > > > > >> > flexibility in
> > > > > > >> > > > certain situations. For example, if a server-side
> assignor
> > > > has some
> > > > > > >> > poor
> > > > > > >> > > > behavior and needs a patch, deploying the patched
> assignor to
> > > > the
> > > > > > >> > client
> > > > > > >> > > > and switching one consumer group to a client-side
> assignor
> > > > may be
> > > > > > >> > faster
> > > > > > >> > > > and less risky than patching all of the brokers. With
> the
> > > > currently
> > > > > > >> > > > proposed distinct APIs, a non-trivial reimplementation
> would
> > > > have
> > > > > > >> to be
> > > > > > >> > > > assembled, and if the two APIs have diverged
> significantly,
> > > > then it
> > > > > > >> is
> > > > > > >> > > > possible that a reimplementation would not be possible.
> > > > > > >> > > >
> > > > > > >> > > > --
> > > > > > >> > > > Greg Harris
> > > > > > >> > > > gharris1...@gmail.com
> > > > > > >> > > > github.com/gharris1727
> > > > > > >> > > >
> > > > > > >> > > > On Wed, Aug 3, 2022 at 8:39 AM Sagar <
> > > > sagarmeansoc...@gmail.com>
> > > > > > >> > wrote:
> > > > > > >> > > >
> > > > > > >> > > > > Hi Guozhang/David,
> > > > > > >> > > > >
> > > > > > >> > > > > I created a confluence page to discuss how Connect
> would
> > > > need to
> > > > > > >> > change
> > > > > > >> > > > > based on the new rebalance protocol. Here's the page:
> > > > > > >> > > > >
> > > > > > >> > > > >
> > > > > > >> >
> > > > > > >>
> > > >
> https://cwiki.apache.org/confluence/display/KAFKA/%5BDRAFT%5DIntegrating+Kafka+Connect+With+New+Consumer+Rebalance+Protocol
> > > > > > >> > > > >
> > > > > > >> > > > > It's also pretty longish and I have tried to keep a
> format
> > > > > > >> similar to
> > > > > > >> > > > > KIP-848. Let me know what you think. Also, do you
> think this
> > > > > > >> should
> > > > > > >> > be
> > > > > > >> > > > > moved to a separate discussion thread or is this one
> fine?
> > > > > > >> > > > >
> > > > > > >> > > > > Thanks!
> > > > > > >> > > > > Sagar.
> > > > > > >> > > > >
> > > > > > >> > > > > On Tue, Jul 26, 2022 at 7:37 AM Sagar <
> > > > sagarmeansoc...@gmail.com>
> > > > > > >> > wrote:
> > > > > > >> > > > >
> > > > > > >> > > > > > Hello Guozhang,
> > > > > > >> > > > > >
> > > > > > >> > > > > > Thank you so much for the doc on Kafka Streams.
> Sure, I
> > > > would do
> > > > > > >> > the
> > > > > > >> > > > > > analysis and come up with such a document.
> > > > > > >> > > > > >
> > > > > > >> > > > > > Thanks!
> > > > > > >> > > > > > Sagar.
> > > > > > >> > > > > >
> > > > > > >> > > > > > On Tue, Jul 26, 2022 at 4:47 AM Guozhang Wang <
> > > > > > >> wangg...@gmail.com>
> > > > > > >> > > > > wrote:
> > > > > > >> > > > > >
> > > > > > >> > > > > >> Hello Sagar,
> > > > > > >> > > > > >>
> > > > > > >> > > > > >> It would be great if you could come back with some
> > > > analysis on
> > > > > > >> > how to
> > > > > > >> > > > > >> implement the Connect side integration with the new
> > > > protocol;
> > > > > > >> so
> > > > > > >> > far
> > > > > > >> > > > > >> besides leveraging on the new "protocol type" we
> did not
> > > > yet
> > > > > > >> think
> > > > > > >> > > > > through
> > > > > > >> > > > > >> the Connect side implementations. For Streams
> here's a
> > > > draft of
> > > > > > >> > > > > >> integration
> > > > > > >> > > > > >> plan:
> > > > > > >> > > > > >>
> > > > > > >> > > > > >>
> > > > > > >> > > > >
> > > > > > >> >
> > > > > > >>
> > > >
> https://docs.google.com/document/d/17PNz2sGoIvGyIzz8vLyJTJTU2rqnD_D9uHJnH9XARjU/edit#heading=h.pdgirmi57dvn
> > > > > > >> > > > > >> just FYI for your analysis on Connect.
> > > > > > >> > > > > >>
> > > > > > >> > > > > >> On Tue, Jul 19, 2022 at 10:48 PM Sagar <
> > > > > > >> sagarmeansoc...@gmail.com
> > > > > > >> > >
> > > > > > >> > > > > wrote:
> > > > > > >> > > > > >>
> > > > > > >> > > > > >> > Hi David,
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >> > Thank you for your response. The reason I thought
> > > > connect can
> > > > > > >> > also fit
> > > > > > >> > > > > >> into
> > > > > > >> > > > > >> > this new scheme is that even today the connect
> uses a
> > > > > > >> > > > > WorkerCoordinator
> > > > > > >> > > > > >> > extending from AbstractCoordinator to empower
> > > > rebalances of
> > > > > > >> > > > > >> > tasks/connectors. The WorkerCoordinator sets the
> > > > > > >> protocolType()
> > > > > > >> > to
> > > > > > >> > > > > >> connect
> > > > > > >> > > > > >> > and uses the metadata() method by plumbing into
> > > > > > >> > > > > >> JoinGroupRequestProtocol.
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >> > I think the changes to support connect would be
> > > > similar at a
> > > > > > >> > high
> > > > > > >> > > > > level
> > > > > > >> > > > > >> to
> > > > > > >> > > > > >> > the changes in streams mainly because of the
> Client
> > > > side
> > > > > > >> > assignors
> > > > > > >> > > > > being
> > > > > > >> > > > > >> > used in both. At an implementation level, we
> might
> > > > need to
> > > > > > >> make
> > > > > > >> > a lot
> > > > > > >> > > > > of
> > > > > > >> > > > > >> > changes to get onto this new assignment protocol
> like
> > > > > > >> enhancing
> > > > > > >> > the
> > > > > > >> > > > > >> > JoinGroup request/response and SyncGroup and
> using
> > > > > > >> > > > > >> ConsumerGroupHeartbeat
> > > > > > >> > > > > >> > API etc again on similar lines to streams (or
> there
> > > > might be
> > > > > > >> > > > > >> deviations). I
> > > > > > >> > > > > >> > would try to perform a detailed analysis of the
> same
> > > > and we
> > > > > > >> > can have
> > > > > > >> > > > > a
> > > > > > >> > > > > >> > separate discussion thread for that as that would
> > > > derail this
> > > > > > >> > > > > discussion
> > > > > > >> > > > > >> > thread. Let me know if that sounds good to you.
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >> > Thanks!
> > > > > > >> > > > > >> > Sagar.
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >> > On Fri, Jul 15, 2022 at 5:47 PM David Jacot
> > > > > > >> > > > > <dja...@confluent.io.invalid
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >> > wrote:
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >> > > Hi Sagar,
> > > > > > >> > > > > >> > >
> > > > > > >> > > > > >> > > Thanks for your comments.
> > > > > > >> > > > > >> > >
> > > > > > >> > > > > >> > > 1) Yes. That refers to `Assignment#error`.
> Sure, I
> > > > can
> > > > > > >> > mention it.
> > > > > > >> > > > > >> > >
> > > > > > >> > > > > >> > > 2) The idea is to transition C from his current
> > > > assignment
> > > > > > >> to
> > > > > > >> > his
> > > > > > >> > > > > >> > > target assignment when he can move to epoch 3.
> When
> > > > that
> > > > > > >> > happens,
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > > member assignment is updated and persisted
> with all
> > > > its
> > > > > > >> > assigned
> > > > > > >> > > > > >> > > partitions even if they are not all revoked
> yet. In
> > > > other
> > > > > > >> > words, the
> > > > > > >> > > > > >> > > member assignment becomes the target
> assignment.
> > > > This is
> > > > > > >> > basically
> > > > > > >> > > > > an
> > > > > > >> > > > > >> > > optimization to avoid having to write all the
> > > > changes to
> > > > > > >> the
> > > > > > >> > log.
> > > > > > >> > > > > The
> > > > > > >> > > > > >> > > examples are based on the persisted state so I
> > > > understand
> > > > > > >> the
> > > > > > >> > > > > >> > > confusion. Let me see if I can improve this in
> the
> > > > > > >> > description.
> > > > > > >> > > > > >> > >
> > > > > > >> > > > > >> > > 3) Regarding Connect, it could reuse the
> protocol
> > > > with a
> > > > > > >> > client side
> > > > > > >> > > > > >> > > assignor if it fits in the protocol. The
> assignment
> > > > is
> > > > > > >> about
> > > > > > >> > > > > >> > > topicid-partitions + metadata, could Connect
> fit
> > > > into this?
> > > > > > >> > > > > >> > >
> > > > > > >> > > > > >> > > Best,
> > > > > > >> > > > > >> > > David
> > > > > > >> > > > > >> > >
> > > > > > >> > > > > >> > > On Fri, Jul 15, 2022 at 1:55 PM Sagar <
> > > > > > >> > sagarmeansoc...@gmail.com>
> > > > > > >> > > > > >> wrote:
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > > Hi David,
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > > Thanks for the KIP. I just had minor
> observations:
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > > 1) In the Assignment Error section in Client
> Side
> > > > mode
> > > > > > >> > Assignment
> > > > > > >> > > > > >> > > process,
> > > > > > >> > > > > >> > > > you mentioned => `In this case, the client
> side
> > > > assignor
> > > > > > >> can
> > > > > > >> > > > > return
> > > > > > >> > > > > >> an
> > > > > > >> > > > > >> > > > error to the group coordinator`. In this
> case are
> > > > you
> > > > > > >> > referring to
> > > > > > >> > > > > >> the
> > > > > > >> > > > > >> > > > Assignor returning an AssignmentError that's
> > > > listed down
> > > > > > >> > towards
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > end?
> > > > > > >> > > > > >> > > > If yes, do you think it would make sense to
> > > > mention this
> > > > > > >> > > > > explicitly
> > > > > > >> > > > > >> > here?
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > > 2) In the Case Studies section, I have a
> slight
> > > > > > >> confusion,
> > > > > > >> > not
> > > > > > >> > > > > sure
> > > > > > >> > > > > >> if
> > > > > > >> > > > > >> > > > others have the same. Consider this step:
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > > When B heartbeats, the group coordinator
> > > > transitions him
> > > > > > >> to
> > > > > > >> > epoch
> > > > > > >> > > > > 3
> > > > > > >> > > > > >> > > because
> > > > > > >> > > > > >> > > > B has no partitions to revoke. It persists
> the
> > > > change and
> > > > > > > >> > replies.
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > >    - Group (epoch=3)
> > > > > > >> > > > > >> > > >       - A
> > > > > > >> > > > > >> > > >       - B
> > > > > > >> > > > > >> > > >       - C
> > > > > > >> > > > > >> > > >    - Target Assignment (epoch=3)
> > > > > > >> > > > > >> > > >       - A - partitions=[foo-0]
> > > > > > >> > > > > >> > > >       - B - partitions=[foo-2]
> > > > > > >> > > > > >> > > >       - C - partitions=[foo-1]
> > > > > > >> > > > > >> > > >    - Member Assignment
> > > > > > >> > > > > >> > > >       - A - epoch=2, partitions=[foo-0,
> foo-1]
> > > > > > >> > > > > >> > > >       - B - epoch=3, partitions=[foo-2]
> > > > > > >> > > > > >> > > >       - C - epoch=3, partitions=[foo-1]
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > > When C heartbeats, it transitions to epoch 3
> but
> > > > cannot
> > > > > > >> get
> > > > > > >> > foo-1
> > > > > > >> > > > > >> yet.
> > > > > > >> > > > > >> > > >
> > > > > > > >> > > > > >> > > > Here, it's mentioned that member C can't get
> the
> > > > foo-1
> > > > > > >> > partition
> > > > > > >> > > > > yet,
> > > > > > >> > > > > >> > but
> > > > > > >> > > > > >> > > > based on the description above, it seems it
> > > > already has
> > > > > > >> it.
> > > > > > >> > Do you
> > > > > > >> > > > > >> > think
> > > > > > >> > > > > >> > > it
> > > > > > >> > > > > >> > > > would be better to remove it and populate it
> only
> > > > when it
> > > > > > >> > actually
> > > > > > >> > > > > >> gets
> > > > > > >> > > > > >> > > it?
> > > > > > >> > > > > >> > > > I see this in a lot of other places, so have
> I
> > > > > > >> understood it
> > > > > > >> > > > > >> > incorrectly
> > > > > > >> > > > > >> > > ?
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > >
> > > > > > > >> > > > > >> > > > Regarding Connect, it might be out of scope
> of
> > > > this
> > > > > > >> > discussion,
> > > > > > >> > > > > but
> > > > > > >> > > > > >> > from
> > > > > > >> > > > > >> > > > what I understood it would probably be
> running in
> > > > client
> > > > > > >> > side
> > > > > > >> > > > > >> assignor
> > > > > > >> > > > > >> > > mode
> > > > > > >> > > > > >> > > > even on the new rebalance protocol as it has
> its
> > > > own
> > > > > > >> Custom
> > > > > > > >> > > > > >> > > Assignors (Eager
> > > > > > >> > > > > >> > > > and IncrementalCooperative).
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > > Thanks!
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > > Sagar.
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > > On Fri, Jul 15, 2022 at 5:00 PM David Jacot
> > > > > > >> > > > > >> > <dja...@confluent.io.invalid
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > > wrote:
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > > > Thanks Hector! Our goal is to move forward
> with
> > > > > > > >> > specialized APIs
> > > > > > >> > > > > >> > > > > instead of relying on one generic API. For
> > > > Connect, we
> > > > > > >> > can apply
> > > > > > >> > > > > >> the
> > > > > > >> > > > > >> > > > > exact same pattern and reuse/share the core
> > > > > > >> > implementation on
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > > > > server side. For the schema registry, I
> think
> > > > that we
> > > > > > >> > should
> > > > > > >> > > > > >> consider
> > > > > > >> > > > > >> > > > > having a tailored API to do simple
> > > > membership/leader
> > > > > > >> > election.
> > > > > > >> > > > > >> > > > >
> > > > > > >> > > > > >> > > > > Best,
> > > > > > >> > > > > >> > > > > David
> > > > > > >> > > > > >> > > > >
> > > > > > >> > > > > >> > > > > On Fri, Jul 15, 2022 at 10:22 AM Ismael
> Juma <
> > > > > > >> > ism...@juma.me.uk
> > > > > > >> > > > > >
> > > > > > >> > > > > >> > > wrote:
> > > > > > >> > > > > >> > > > > >
> > > > > > >> > > > > >> > > > > > Three quick comments:
> > > > > > >> > > > > >> > > > > >
> > > > > > >> > > > > >> > > > > > 1. Regarding java.util.regex.Pattern vs
> > > > > > >> > > > > >> com.google.re2j.Pattern, we
> > > > > > >> > > > > >> > > > > should
> > > > > > >> > > > > >> > > > > > document the differences in more detail
> before
> > > > > > >> deciding
> > > > > > >> > one
> > > > > > >> > > > > way
> > > > > > >> > > > > >> or
> > > > > > >> > > > > >> > > > > another.
> > > > > > >> > > > > >> > > > > > That said, if people pass
> > > > java.util.regex.Pattern,
> > > > > > >> they
> > > > > > >> > expect
> > > > > > >> > > > > >> > their
> > > > > > >> > > > > >> > > > > > semantics to be honored. If we are doing
> > > > something
> > > > > > >> > different,
> > > > > > >> > > > > >> then
> > > > > > >> > > > > >> > we
> > > > > > >> > > > > >> > > > > > should consider adding an overload with
> our own
> > > > > > >> Pattern
> > > > > > >> > class
> > > > > > >> > > > > (I
> > > > > > >> > > > > >> > > don't
> > > > > > >> > > > > >> > > > > > think we'd want to expose re2j's at this
> > > > point).
> > > > > > >> > > > > >> > > > > > 2. Regarding topic ids, any major new
> protocol
> > > > should
> > > > > > >> > > > > integrate
> > > > > > >> > > > > >> > fully
> > > > > > >> > > > > >> > > > > with
> > > > > > >> > > > > >> > > > > > it and should handle the topic
> recreation case
> > > > > > >> > correctly.
> > > > > > >> > > > > That's
> > > > > > >> > > > > >> > the
> > > > > > >> > > > > >> > > main
> > > > > > >> > > > > >> > > > > > part we need to handle. I agree with
> David
> > > > that we'd
> > > > > > >> > want to
> > > > > > >> > > > > add
> > > > > > >> > > > > >> > > topic
> > > > > > >> > > > > >> > > > > ids
> > > > > > >> > > > > >> > > > > > to the relevant protocols that don't
> have it
> > > > yet and
> > > > > > >> > that we
> > > > > > >> > > > > can
> > > > > > >> > > > > >> > > probably
> > > > > > >> > > > > >> > > > > > focus on the internals versus adding new
> APIs
> > > > to the
> > > > > > >> > Java
> > > > > > >> > > > > >> Consumer
> > > > > > >> > > > > >> > > > > (unless
> > > > > > >> > > > > >> > > > > > we find that adding new APIs is required
> for
> > > > > > >> reasonable
> > > > > > >> > > > > >> semantics).
> > > > > > >> > > > > >> > > > > > 3. I am still not sure about the
> coordinator
> > > > storing
> > > > > > >> the
> > > > > > >> > > > > >> configs.
> > > > > > >> > > > > >> > > It's
> > > > > > >> > > > > >> > > > > > powerful for configs to be centralized
> in the
> > > > > > >> metadata
> > > > > > >> > log for
> > > > > > >> > > > > >> > > various
> > > > > > >> > > > > >> > > > > > reasons (auditability, visibility,
> consistency,
> > > > > > >> etc.).
> > > > > > >> > > > > >> Similarly, I
> > > > > > >> > > > > >> > > am
> > > > > > >> > > > > >> > > > > not
> > > > > > >> > > > > >> > > > > > sure about automatically deleting
> configs in a
> > > > way
> > > > > > >> that
> > > > > > >> > they
> > > > > > >> > > > > >> cannot
> > > > > > >> > > > > >> > > be
> > > > > > >> > > > > >> > > > > > recovered. A good property for modern
> systems
> > > > is to
> > > > > > >> > minimize
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > > number
> > > > > > >> > > > > >> > > > > of
> > > > > > >> > > > > >> > > > > > unrecoverable data loss scenarios.
> > > > > > >> > > > > >> > > > > >
> > > > > > >> > > > > >> > > > > > Ismael
> > > > > > >> > > > > >> > > > > >
> > > > > > >> > > > > >> > > > > > On Wed, Jul 13, 2022 at 3:47 PM David
> Jacot
> > > > > > >> > > > > >> > > <dja...@confluent.io.invalid
> > > > > > >> > > > > >> > > > > >
> > > > > > >> > > > > >> > > > > > wrote:
> > > > > > >> > > > > >> > > > > >
> > > > > > >> > > > > >> > > > > > > Thanks Guozhang. My answers are below:
> > > > > > >> > > > > >> > > > > > >
> > > > > > >> > > > > >> > > > > > > > 1) the migration path, especially
> the last
> > > > step
> > > > > > >> when
> > > > > > >> > > > > clients
> > > > > > >> > > > > >> > > flip the
> > > > > > >> > > > > >> > > > > > > flag
> > > > > > >> > > > > >> > > > > > > > to enable the new protocol, in which
> we
> > > > would
> > > > > > >> have a
> > > > > > >> > > > > window
> > > > > > >> > > > > >> > where
> > > > > > >> > > > > >> > > > > both
> > > > > > >> > > > > >> > > > > > > new
> > > > > > >> > > > > >> > > > > > > > protocols / rpcs and old protocols /
> rpcs
> > > > are
> > > > > > >> used
> > > > > > >> > by
> > > > > > >> > > > > >> members
> > > > > > >> > > > > >> > of
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > same
> > > > > > >> > > > > >> > > > > > > > group. How the coordinator could
> "mimic"
> > > > the old
> > > > > > >> > behavior
> > > > > > >> > > > > >> while
> > > > > > >> > > > > >> > > > > using the
> > > > > > >> > > > > >> > > > > > > > new protocol is something we need to
> > > > > elaborate
> > > > > > > >> on.
> > > > > > >> > > > > >> > > > > > >
> > > > > > >> > > > > >> > > > > > > Noted. I just published a new version
> of KIP
> > > > which
> > > > > > >> > includes
> > > > > > >> > > > > >> more
> > > > > > >> > > > > >> > > > > > > details about this. See the "Supporting
> > > > Online
> > > > > > >> > Consumer
> > > > > > >> > > > > Group
> > > > > > >> > > > > >> > > Upgrade"
> > > > > > >> > > > > >> > > > > > > and the "Compatibility, Deprecation,
> and
> > > > Migration
> > > > > > >> > Plan". I
> > > > > > >> > > > > >> think
> > > > > > >> > > > > >> > > that
> > > > > > >> > > > > >> > > > > > > I have to think through a few cases
> now but
> > > > the
> > > > > > >> > overall idea
> > > > > > >> > > > > >> and
> > > > > > >> > > > > >> > > > > > > mechanism should be understandable.
> > > > > > >> > > > > >> > > > > > >
> > > > > > > >> > > > > >> > > > > > > > 2) the usage of topic ids. As of
> > > > KIP-516 the
> > > > > > >> > topic ids
> > > > > > >> > > > > >> are
> > > > > > >> > > > > >> > > only
> > > > > > >> > > > > >> > > > > used
> > > > > > >> > > > > >> > > > > > > as
> > > > > > >> > > > > >> > > > > > > > part of RPCs and admin client, but
> they
> > > > are not
> > > > > > >> > exposed
> > > > > > >> > > > > via
> > > > > > >> > > > > >> any
> > > > > > >> > > > > >> > > > > public
> > > > > > >> > > > > >> > > > > > > APIs
> > > > > > >> > > > > >> > > > > > > > to consumers yet. I think the
> question is,
> > > > first
> > > > > > >> > should we
> > > > > > >> > > > > >> let
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > > > consumer
> > > > > > > >> > > > > >> > > > > > > > client maintain the names
> -> ids
> > > > mapping
> > > > > > >> > itself
> > > > > > >> > > > > to
> > > > > > >> > > > > >> > fully
> > > > > > >> > > > > >> > > > > > > leverage
> > > > > > >> > > > > >> > > > > > > > on all the augmented existing RPCs
> and the
> > > > new
> > > > > > >> RPCs
> > > > > > >> > with
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > > topic
> > > > > > >> > > > > >> > > > > ids;
> > > > > > >> > > > > >> > > > > > > and
> > > > > > >> > > > > >> > > > > > > > secondly, should we ever consider
> exposing
> > > > the
> > > > > > >> > topic ids
> > > > > > >> > > > > in
> > > > > > >> > > > > >> the
> > > > > > >> > > > > >> > > > > consumer
> > > > > > >> > > > > >> > > > > > > > public APIs as well (both
> > > > subscribe/assign, as
> > > > > > >> well
> > > > > > >> > as in
> > > > > > >> > > > > >> the
> > > > > > >> > > > > >> > > > > rebalance
> > > > > > >> > > > > >> > > > > > > > listener for cases like topic
> > > > > > >> > deletion-and-recreation).
> > > > > > >> > > > > >> > > > > > >
> > > > > > >> > > > > >> > > > > > > a) Assuming that we would include
> converting
> > > > all
> > > > > > >> the
> > > > > > >> > offsets
> > > > > > >> > > > > >> > > related
> > > > > > >> > > > > >> > > > > > > RPCs to using topic ids in this KIP,
> the
> > > > consumer
> > > > > > >> > would be
> > > > > > >> > > > > >> able
> > > > > > >> > > > > >> > to
> > > > > > >> > > > > >> > > > > > > fully operate with topic ids. That
> being
> > > > said, it
> > > > > > >> > still has
> > > > > > >> > > > > to
> > > > > > >> > > > > >> > > provide
> > > > > > >> > > > > >> > > > > > > the topics names in various APIs so
> having a
> > > > > > >> mapping
> > > > > > >> > in the
> > > > > > >> > > > > >> > > consumer
> > > > > > >> > > > > >> > > > > > > seems inevitable to me.
> > > > > > >> > > > > >> > > > > > > b) I don't have a strong opinion on
> this.
> > > > Here I
> > > > > > >> > wonder if
> > > > > > >> > > > > >> this
> > > > > > >> > > > > >> > > goes
> > > > > > >> > > > > >> > > > > > > beyond the scope of this KIP. I would
> rather
> > > > focus
> > > > > > >> on
> > > > > > >> > the
> > > > > > >> > > > > >> > internals
> > > > > > >> > > > > >> > > > > > > here and we can consider this
> separately if
> > > > we see
> > > > > > >> > value in
> > > > > > >> > > > > >> doing
> > > > > > >> > > > > >> > > it.
> > > > > > >> > > > > >> > > > > > >
> > > > > > >> > > > > >> > > > > > > Coming back to Ismael's point about
> using
> > > > topic ids
> > > > > > >> > in the
> > > > > > >> > > > > >> > > > > > > ConsumerGroupHeartbeatRequest, I think
> that
> > > > there
> > > > > > >> is
> > > > > > >> > one
> > > > > > >> > > > > >> > advantage
> > > > > > >> > > > > >> > > in
> > > > > > >> > > > > >> > > > > > > favour of it. The consumer will have
> the
> > > > > > >> opportunity
> > > > > > >> > to
> > > > > > >> > > > > >> validate
> > > > > > >> > > > > >> > > that
> > > > > > > >> > > > > >> > > > > > > the topics exist before passing them
> into
> > > > the
> > > > > > >> group
> > > > > > >> > > > > rebalance
> > > > > > >> > > > > >> > > > > > > protocol. Obviously, the coordinator
> will
> > > > also
> > > > > > >> notice
> > > > > > >> > it but
> > > > > > >> > > > > >> it
> > > > > > >> > > > > >> > > does
> > > > > > >> > > > > >> > > > > > > not really have a way to reject an
> invalid
> > > > topic in
> > > > > > >> > the
> > > > > > >> > > > > >> response.
> > > > > > >> > > > > >> > > > > > >
> > > > > > >> > > > > >> > > > > > > > I'm agreeing with David on all other
> minor
> > > > > > >> questions
> > > > > > >> > > > > except
> > > > > > >> > > > > >> for
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > > > > `subscribe(Pattern)` question:
> personally
> > > > I think
> > > > > > >> > it's not
> > > > > > >> > > > > >> > > necessary
> > > > > > >> > > > > >> > > > > to
> > > > > > >> > > > > >> > > > > > > > deprecate the subscribe API with
> Pattern,
> > > > but
> > > > > > >> > instead we
> > > > > > >> > > > > >> still
> > > > > > >> > > > > >> > > use
> > > > > > >> > > > > >> > > > > > > Pattern
> > > > > > >> > > > > >> > > > > > > > while just documenting that our
> > > > subscription may
> > > > > > >> be
> > > > > > >> > > > > >> rejected by
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > > > server.
> > > > > > >> > > > > >> > > > > > > > Since the incompatible case is a
> very rare
> > > > > > >> scenario
> > > > > > >> > I felt
> > > > > > >> > > > > >> > using
> > > > > > >> > > > > >> > > an
> > > > > > >> > > > > >> > > > > > > > overloaded `String` based
> subscription may
> > > > be
> > > > > > >> more
> > > > > > >> > > > > >> vulnerable
> > > > > > >> > > > > >> > to
> > > > > > >> > > > > >> > > > > various
> > > > > > >> > > > > >> > > > > > > > invalid regexes.
> > > > > > >> > > > > >> > > > > > >
> > > > > > >> > > > > >> > > > > > > That could work. I have to look at the
> > > > differences
> > > > > > >> > between
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > two
> > > > > > >> > > > > >> > > > > > > engines to better understand the
> potential
> > > > issues.
> > > > > > >> My
> > > > > > >> > > > > >> > > understanding is
> > > > > > > >> > > > > >> > > > > > > that it would work for all the basic
> regular
> > > > > > >> > expressions. The
> > > > > > >> > > > > >> > > differences
> > > > > > >> > > > > >> > > > > > > between the two are mainly about the
> various
> > > > > > >> character
> > > > > > >> > > > > >> classes. I
> > > > > > >> > > > > >> > > > > > > wonder what other people think about
> this.
> > > > > > >> > > > > >> > > > > > >
> > > > > > >> > > > > >> > > > > > > Best,
> > > > > > >> > > > > >> > > > > > > David
> > > > > > >> > > > > >> > > > > > >
> > > > > > >> > > > > >> > > > > > > On Tue, Jul 12, 2022 at 11:28 PM
> Guozhang
> > > > Wang <
> > > > > > >> > > > > >> > wangg...@gmail.com
> > > > > > >> > > > > >> > > >
> > > > > > >> > > > > >> > > > > wrote:
> > > > > > >> > > > > >> > > > > > > >
> > > > > > >> > > > > >> > > > > > > > Thanks David! I think on the high
> level
> > > > there are
> > > > > > >> > two meta
> > > > > > >> > > > > >> > > points we
> > > > > > >> > > > > >> > > > > need
> > > > > > >> > > > > >> > > > > > > > to concretize a bit more:
> > > > > > >> > > > > >> > > > > > > >
> > > > > > >> > > > > >> > > > > > > > 1) the migration path, especially
> the last
> > > > step
> > > > > > >> when
> > > > > > >> > > > > clients
> > > > > > >> > > > > >> > > flip the
> > > > > > >> > > > > >> > > > > > > flag
> > > > > > >> > > > > >> > > > > > > > to enable the new protocol, in which
> we
> > > > would
> > > > > > >> have a
> > > > > > >> > > > > window
> > > > > > >> > > > > >> > where
> > > > > > >> > > > > >> > > > > both
> > > > > > >> > > > > >> > > > > > > new
> > > > > > >> > > > > >> > > > > > > > protocols / rpcs and old protocols /
> rpcs
> > > > are
> > > > > > >> used
> > > > > > >> > by
> > > > > > >> > > > > >> members
> > > > > > >> > > > > >> > of
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > same
> > > > > > >> > > > > >> > > > > > > > group. How the coordinator could
> "mimic"
> > > > the old
> > > > > > >> > behavior
> > > > > > >> > > > > >> while
> > > > > > >> > > > > >> > > > > using the
> > > > > > >> > > > > >> > > > > > > > new protocol is something we need to
> > > > > elaborate
> > > > > > > >> on.
> > > > > > > >> > > > > >> > > > > > > > 2) the usage of topic ids. As of
> > > > KIP-516 the
> > > > > > >> > topic ids
> > > > > > >> > > > > >> are
> > > > > > >> > > > > >> > > only
> > > > > > >> > > > > >> > > > > used
> > > > > > >> > > > > >> > > > > > > as
> > > > > > >> > > > > >> > > > > > > > part of RPCs and admin client, but
> they
> > > > are not
> > > > > > >> > exposed
> > > > > > >> > > > > via
> > > > > > >> > > > > >> any
> > > > > > >> > > > > >> > > > > public
> > > > > > >> > > > > >> > > > > > > APIs
> > > > > > >> > > > > >> > > > > > > > to consumers yet. I think the
> question is,
> > > > first
> > > > > > >> > should we
> > > > > > >> > > > > >> let
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > > > consumer
> > > > > > > >> > > > > >> > > > > > > > client maintain the names
> -> ids
> > > > mapping
> > > > > > >> > itself
> > > > > > >> > > > > to
> > > > > > >> > > > > >> > fully
> > > > > > >> > > > > >> > > > > > > leverage
> > > > > > >> > > > > >> > > > > > > > on all the augmented existing RPCs
> and the
> > > > new
> > > > > > >> RPCs
> > > > > > >> > with
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > > topic
> > > > > > >> > > > > >> > > > > ids;
> > > > > > >> > > > > >> > > > > > > and
> > > > > > >> > > > > >> > > > > > > > secondly, should we ever consider
> exposing
> > > > the
> > > > > > >> > topic ids
> > > > > > >> > > > > in
> > > > > > >> > > > > >> the
> > > > > > >> > > > > >> > > > > consumer
> > > > > > >> > > > > >> > > > > > > > public APIs as well (both
> > > > subscribe/assign, as
> > > > > > >> well
> > > > > > >> > as in
> > > > > > >> > > > > >> the
> > > > > > >> > > > > >> > > > > rebalance
> > > > > > >> > > > > >> > > > > > > > listener for cases like topic
> > > > > > >> > deletion-and-recreation).
> > > > > > >> > > > > >> > > > > > > >
> > > > > > >> > > > > >> > > > > > > > I'm agreeing with David on all other
> minor
> > > > > > >> questions
> > > > > > >> > > > > except
> > > > > > >> > > > > >> for
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > > > > `subscribe(Pattern)` question:
> personally
> > > > I think
> > > > > > >> > it's not
> > > > > > >> > > > > >> > > necessary
> > > > > > >> > > > > >> > > > > to
> > > > > > >> > > > > >> > > > > > > > deprecate the subscribe API with
> Pattern,
> > > > but
> > > > > > >> > instead we
> > > > > > >> > > > > >> still
> > > > > > >> > > > > >> > > use
> > > > > > >> > > > > >> > > > > > > Pattern
> > > > > > >> > > > > >> > > > > > > > while just documenting that our
> > > > subscription may
> > > > > > >> be
> > > > > > >> > > > > >> rejected by
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > > > server.
> > > > > > >> > > > > >> > > > > > > > Since the incompatible case is a
> very rare
> > > > > > >> scenario
> > > > > > >> > I felt
> > > > > > >> > > > > >> > using
> > > > > > >> > > > > >> > > an
> > > > > > >> > > > > >> > > > > > > > overloaded `String` based
> subscription may
> > > > be
> > > > > > >> more
> > > > > > >> > > > > >> vulnerable
> > > > > > >> > > > > >> > to
> > > > > > >> > > > > >> > > > > various
> > > > > > >> > > > > >> > > > > > > > invalid regexes.
> > > > > > >> > > > > >> > > > > > > >
> > > > > > >> > > > > >> > > > > > > >
> > > > > > >> > > > > >> > > > > > > > Guozhang
> > > > > > >> > > > > >> > > > > > > >
> > > > > > >> > > > > >> > > > > > > > On Tue, Jul 12, 2022 at 5:23 AM
> David Jacot
> > > > > > >> > > > > >> > > > > <dja...@confluent.io.invalid
> > > > > > >> > > > > >> > > > > > > >
> > > > > > >> > > > > >> > > > > > > > wrote:
> > > > > > >> > > > > >> > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > Hi Ismael,
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > Thanks for your feedback. Let me
> answer
> > > > your
> > > > > > >> > questions
> > > > > > >> > > > > >> > inline.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 1. I think it's premature to
> talk about
> > > > > > >> target
> > > > > > >> > > > > versions
> > > > > > >> > > > > >> for
> > > > > > >> > > > > >> > > > > > > deprecation
> > > > > > >> > > > > >> > > > > > > > > and
> > > > > > >> > > > > >> > > > > > > > > > removal of the existing group
> protocol.
> > > > > > >> Unlike
> > > > > > >> > KRaft,
> > > > > > >> > > > > >> this
> > > > > > >> > > > > >> > > > > affects a
> > > > > > >> > > > > >> > > > > > > core
> > > > > > >> > > > > >> > > > > > > > > > client protocol and hence
> > > > deprecation/removal
> > > > > > >> > will be
> > > > > > >> > > > > >> > heavily
> > > > > > >> > > > > >> > > > > > > dependent
> > > > > > >> > > > > >> > > > > > > > > on
> > > > > > >> > > > > >> > > > > > > > > > how quickly applications migrate
> to
> > > > the new
> > > > > > >> > protocol.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > That makes sense. I will remove it.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 2. The KIP says we intend to
> release
> > > > this in
> > > > > > >> > 4.x, but
> > > > > > >> > > > > it
> > > > > > >> > > > > >> > > wasn't
> > > > > > >> > > > > >> > > > > made
> > > > > > >> > > > > >> > > > > > > > > clear
> > > > > > >> > > > > >> > > > > > > > > > why. If we added that as a way to
> > > > estimate
> > > > > > >> when
> > > > > > >> > we'd
> > > > > > >> > > > > >> > > deprecate
> > > > > > >> > > > > >> > > > > and
> > > > > > >> > > > > >> > > > > > > remove
> > > > > > >> > > > > >> > > > > > > > > > the group protocol, I also
> suggest
> > > > removing
> > > > > > >> > this part.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > Let me explain my reasoning. As
> > > > explained, I
> > > > > > >> plan
> > > > > > >> > to
> > > > > > >> > > > > >> rewrite
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > group
> > > > > > >> > > > > >> > > > > > > > > coordinator in Java while we
> implement
> > > > the new
> > > > > > >> > protocol.
> > > > > > >> > > > > >> This
> > > > > > >> > > > > >> > > means
> > > > > > >> > > > > >> > > > > > > > > that the internals will be slightly
> > > > different
> > > > > > >> > (e.g.
> > > > > > >> > > > > >> threading
> > > > > > >> > > > > >> > > > > model).
> > > > > > > >> > > > > >> > > > > > > > > Therefore, I wanted to tie the
> > > > switch from
> > > > > > >> > the old
> > > > > > >> > > > > >> group
> > > > > > >> > > > > >> > > > > > > > > coordinator to the new group
> coordinator
> > > > to a
> > > > > > >> > major
> > > > > > >> > > > > >> release.
> > > > > > >> > > > > >> > > The
> > > > > > >> > > > > >> > > > > > > > > alternative would be to use a flag
> to do
> > > > the
> > > > > > >> > switch
> > > > > > >> > > > > >> instead
> > > > > > >> > > > > >> > of
> > > > > > >> > > > > >> > > > > relying
> > > > > > >> > > > > >> > > > > > > > > on the software upgrade.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 3. We need to flesh out the
> details of
> > > > the
> > > > > > >> > migration
> > > > > > >> > > > > >> story.
> > > > > > >> > > > > >> > > It
> > > > > > >> > > > > >> > > > > sounds
> > > > > > >> > > > > >> > > > > > > > > like
> > > > > > >> > > > > >> > > > > > > > > > we're saying we will support
> online
> > > > > > >> migrations.
> > > > > > >> > Is
> > > > > > >> > > > > that
> > > > > > >> > > > > >> > > correct?
> > > > > > >> > > > > >> > > > > We
> > > > > > >> > > > > >> > > > > > > > > should
> > > > > > >> > > > > >> > > > > > > > > > explain this in detail. It could
> also
> > > > be done
> > > > > > >> > as a
> > > > > > >> > > > > >> separate
> > > > > > >> > > > > >> > > KIP,
> > > > > > >> > > > > >> > > > > if
> > > > > > >> > > > > >> > > > > > > it's
> > > > > > >> > > > > >> > > > > > > > > > easier.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > Yes, we will support online
> migrations
> > > > for the
> > > > > > >> > group.
> > > > > > >> > > > > That
> > > > > > >> > > > > >> > > means
> > > > > > >> > > > > >> > > > > that
> > > > > > >> > > > > >> > > > > > > > > a group using the old protocol
> will be
> > > > able to
> > > > > > >> > switch to
> > > > > > >> > > > > >> the
> > > > > > >> > > > > >> > > new
> > > > > > >> > > > > >> > > > > > > > > protocol.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > Let me briefly explain how that
> will work
> > > > > > >> though.
> > > > > > >> > It is
> > > > > > >> > > > > >> > > basically a
> > > > > > > >> > > > > >> > > > > > > > > four-step process:
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > 1. The cluster must be upgraded or
> > > > rolled to a
> > > > > > > >> > software version
> > > > > > >> > > > > >> > > supporting
> > > > > > >> > > > > >> > > > > the
> > > > > > >> > > > > >> > > > > > > > > new group coordinator. Both the
> old and
> > > > the new
> > > > > > >> > > > > >> coordinator
> > > > > > >> > > > > >> > > will
> > > > > > >> > > > > >> > > > > > > > > support the old protocol and rely
> on the
> > > > same
> > > > > > >> > persisted
> > > > > > >> > > > > >> > > metadata so
> > > > > > >> > > > > >> > > > > > > > > they can work together. This point
> is an
> > > > > > >> offline
> > > > > > >> > > > > >> migration.
> > > > > > >> > > > > >> > We
> > > > > > >> > > > > >> > > > > cannot
> > > > > > >> > > > > >> > > > > > > > > do this one live because it would
> require
> > > > > > >> > shutting down
> > > > > > >> > > > > >> the
> > > > > > >> > > > > >> > > current
> > > > > > >> > > > > >> > > > > > > > > coordinator and starting up the
> new one
> > > > and
> > > > > > >> that
> > > > > > >> > would
> > > > > > >> > > > > >> cause
> > > > > > > >> > > > > >> > > > > > > > > unavailability.
> > > > > > >> > > > > >> > > > > > > > > 2. The cluster's metadata
> version/IBP
> > > > must be
> > > > > > >> > upgraded
> > > > > > >> > > > > to
> > > > > > >> > > > > >> X
> > > > > > >> > > > > >> > in
> > > > > > >> > > > > >> > > > > order
> > > > > > >> > > > > >> > > > > > > > > to enable the new protocol. This
> cannot
> > > > be done
> > > > > > >> > before
> > > > > > >> > > > > 1)
> > > > > > >> > > > > >> is
> > > > > > > >> > > > > >> > > > > > > > > completed because the old
> coordinator
> > > > doesn't
> > > > > > >> > support
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > new
> > > > > > >> > > > > >> > > > > > > > > protocol.
> > > > > > >> > > > > >> > > > > > > > > 3. The consumers must be upgraded
> to a
> > > > version
> > > > > > >> > > > > supporting
> > > > > > >> > > > > >> the
> > > > > > >> > > > > >> > > > > online
> > > > > > >> > > > > >> > > > > > > > > migration (must have KIP-792). If
> the
> > > > consumer
> > > > > > >> is
> > > > > > >> > > > > already
> > > > > > > >> > > > > >> > > there,
> > > > > > > >> > > > > >> > > > > > > > > nothing needs to be done at this point.
> > > > > > >> > > > > >> > > > > > > > > 4. The consumers must be rolled
> with the
> > > > > > >> feature
> > > > > > >> > flag
> > > > > > >> > > > > >> turned
> > > > > > >> > > > > >> > > on.
> > > > > > >> > > > > >> > > > > The
> > > > > > >> > > > > >> > > > > > > > > consumer group is automatically
> > > > converted when
> > > > > > >> > the first
> > > > > > >> > > > > >> > > consumer
> > > > > > >> > > > > >> > > > > > > > > using the new protocol joins the
> group.
> > > > While
> > > > > > >> the
> > > > > > >> > > > > members
> > > > > > >> > > > > >> > > using the
> > > > > > >> > > > > >> > > > > > > > > old protocol are being upgraded,
> the old
> > > > > > >> protocol
> > > > > > >> > is
> > > > > > >> > > > > >> proxied
> > > > > > >> > > > > >> > > into
> > > > > > >> > > > > >> > > > > the
> > > > > > >> > > > > >> > > > > > > > > new one.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > Let me clarify all of this in the
> KIP.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 4. I am happy that we are
> pushing the
> > > > pattern
> > > > > > >> > > > > >> subscriptions
> > > > > > >> > > > > >> > > to
> > > > > > >> > > > > >> > > > > the
> > > > > > >> > > > > >> > > > > > > > > server,
> > > > > > >> > > > > >> > > > > > > > > > but it seems like there could be
> some
> > > > tricky
> > > > > > >> > > > > >> compatibility
> > > > > > >> > > > > >> > > > > issues.
> > > > > > >> > > > > >> > > > > > > Will
> > > > > > >> > > > > >> > > > > > > > > we
> > > > > > >> > > > > >> > > > > > > > > > have a mechanism for users to
> detect
> > > > that
> > > > > > >> they
> > > > > > >> > need to
> > > > > > >> > > > > >> > update
> > > > > > >> > > > > >> > > > > their
> > > > > > >> > > > > >> > > > > > > regex
> > > > > > >> > > > > >> > > > > > > > > > before switching to the new
> protocol?
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > I think that I am a bit more
> optimistic
> > > > than
> > > > > > >> you
> > > > > > >> > on this
> > > > > > >> > > > > >> > > point. I
> > > > > > >> > > > > >> > > > > > > > > believe that the majority of the
> cases
> > > > are
> > > > > > >> simple
> > > > > > >> > > > > regexes
> > > > > > >> > > > > >> > which
> > > > > > >> > > > > >> > > > > should
> > > > > > >> > > > > >> > > > > > > > > work with the new engine. The
> > > > coordinator will
> > > > > > >> > verify
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > regex
> > > > > > >> > > > > >> > > > > anyway
> > > > > > >> > > > > >> > > > > > > > > and reject the consumer if the
> regex is
> > > > not
> > > > > > >> valid.
> > > > > > >> > > > > Coming
> > > > > > >> > > > > >> > back
> > > > > > >> > > > > >> > > to
> > > > > > >> > > > > >> > > > > the
> > > > > > >> > > > > >> > > > > > > > > migration path, in the worst case,
> the
> > > > first
> > > > > > >> > upgraded
> > > > > > >> > > > > >> > consumer
> > > > > > >> > > > > >> > > > > joining
> > > > > > >> > > > > >> > > > > > > > > the group will be rejected. This
> should
> > > > be used
> > > > > > >> > as the
> > > > > > >> > > > > >> last
> > > > > > >> > > > > >> > > > > defence, I
> > > > > > >> > > > > >> > > > > > > > > would say.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > One way for customers to validate
> their
> > > > regex
> > > > > > >> > before
> > > > > > >> > > > > >> > upgrading
> > > > > > >> > > > > >> > > > > their
> > > > > > >> > > > > >> > > > > > > > > prod would be to test them with
> another
> > > > group.
> > > > > > >> For
> > > > > > >> > > > > >> instance,
> > > > > > >> > > > > >> > > that
> > > > > > >> > > > > >> > > > > > > > > could be done in a pre-prod
> environment.
> > > > > > >> Another
> > > > > > >> > way
> > > > > > >> > > > > >> would be
> > > > > > >> > > > > >> > > to
> > > > > > >> > > > > >> > > > > > > > > extend the consumer-group tool to
> > > > provide a
> > > > > > >> regex
> > > > > > >> > > > > >> validation
> > > > > > >> > > > > >> > > > > > > > > mechanism. Would this be enough in
> your
> > > > > > >> opinion?
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 5. Related to the last question,
> will
> > > > the
> > > > > > >> Java
> > > > > > >> > client
> > > > > > >> > > > > >> allow
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > > > users to
> > > > > > >> > > > > >> > > > > > > > > > stick with the current regex
> engine for
> > > > > > >> > compatibility
> > > > > > >> > > > > >> > > reasons?
> > > > > > >> > > > > >> > > > > For
> > > > > > >> > > > > >> > > > > > > > > example,
> > > > > > >> > > > > >> > > > > > > > > > it may be handy to keep using
> client
> > > > based
> > > > > > >> > regex at
> > > > > > >> > > > > >> first
> > > > > > >> > > > > >> > to
> > > > > > >> > > > > >> > > keep
> > > > > > >> > > > > >> > > > > > > > > > migrations simple and then
> migrate to
> > > > server
> > > > > > >> > based
> > > > > > >> > > > > >> regexes
> > > > > > >> > > > > >> > > as a
> > > > > > >> > > > > >> > > > > > > second
> > > > > > >> > > > > >> > > > > > > > > step.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > I understand your point but I am
> > > > concerned that
> > > > > > >> > this
> > > > > > >> > > > > would
> > > > > > >> > > > > >> > > allow
> > > > > > >> > > > > >> > > > > users
> > > > > > >> > > > > >> > > > > > > > > to actually stay in this mode. That
> > > > would go
> > > > > > >> > against our
> > > > > > >> > > > > >> goal
> > > > > > >> > > > > >> > > of
> > > > > > >> > > > > >> > > > > > > > > simplifying the client because we
> would
> > > > have to
> > > > > > >> > continue
> > > > > > >> > > > > >> > > monitoring
> > > > > > >> > > > > >> > > > > > > > > the metadata on the client side. I
> would
> > > > rather
> > > > > > >> > not do
> > > > > > >> > > > > >> this.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 6. When we say that the group
> > > > coordinator
> > > > > > >> will
> > > > > > >> > be
> > > > > > >> > > > > >> > > responsible for
> > > > > > >> > > > > >> > > > > > > storing
> > > > > > >> > > > > >> > > > > > > > > > the configurations and that the
> > > > > > >> configurations
> > > > > > >> > will be
> > > > > > >> > > > > >> > > deleted
> > > > > > >> > > > > >> > > > > when
> > > > > > >> > > > > >> > > > > > > the
> > > > > > >> > > > > >> > > > > > > > > > group is deleted. Will a
> transition to
> > > > DEAD
> > > > > > >> > trigger
> > > > > > >> > > > > >> > deletion
> > > > > > >> > > > > >> > > of
> > > > > > >> > > > > >> > > > > > > > > > configurations?
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > That's right. The configurations
> will be
> > > > > > >> deleted
> > > > > > >> > when
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > > group is
> > > > > > >> > > > > >> > > > > > > > > deleted. They go together.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 7. Will the choice to store the
> > > > configs in
> > > > > > >> the
> > > > > > >> > group
> > > > > > >> > > > > >> > > coordinator
> > > > > > >> > > > > >> > > > > > > make it
> > > > > > >> > > > > >> > > > > > > > > > harder to list all cluster
> configs and
> > > > their
> > > > > > >> > values?
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > I don't think so. The group
> > > > configurations are
> > > > > > >> > overrides
> > > > > > >> > > > > >> of
> > > > > > >> > > > > >> > > cluster
> > > > > > >> > > > > >> > > > > > > > > configs. If you want to know all
> the
> > > > overrides
> > > > > > >> > though,
> > > > > > >> > > > > you
> > > > > > >> > > > > >> > > would
> > > > > > >> > > > > >> > > > > have
> > > > > > >> > > > > >> > > > > > > > > to ask all the group coordinators.
> You
> > > > cannot
> > > > > > >> > rely on
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > > metadata
> > > > > > >> > > > > >> > > > > log
> > > > > > >> > > > > >> > > > > > > > > for instance.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 8. How would someone configure a
> group
> > > > before
> > > > > > >> > starting
> > > > > > >> > > > > >> the
> > > > > > >> > > > > >> > > > > consumers?
> > > > > > >> > > > > >> > > > > > > > > Have
> > > > > > >> > > > > >> > > > > > > > > > we considered allowing the
> explicit
> > > > creation
> > > > > > >> of
> > > > > > >> > > > > groups?
> > > > > > >> > > > > >> > > > > > > Alternatively,
> > > > > > >> > > > > >> > > > > > > > > the
> > > > > > >> > > > > >> > > > > > > > > > configs could be decoupled from
> the
> > > > group
> > > > > > >> > lifecycle.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > Yes. The group will be
> automatically
> > > > created in
> > > > > > >> > this
> > > > > > >> > > > > case.
> > > > > > >> > > > > >> > > However,
> > > > > > >> > > > > >> > > > > > > > > the configs will be lost after the
> > > > retention
> > > > > > >> > period of
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > > group
> > > > > > >> > > > > >> > > > > > > > > passes.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 9. Will the Consumer.subscribe
> method
> > > > for the
> > > > > > >> > Java
> > > > > > >> > > > > >> client
> > > > > > >> > > > > >> > > still
> > > > > > >> > > > > >> > > > > take
> > > > > > >> > > > > >> > > > > > > a
> > > > > > > >> > > > > >> > > > > > > > > > `java.util.regex.Pattern` or do
> we
> > > > have to
> > > > > > >> > introduce
> > > > > > >> > > > > an
> > > > > > >> > > > > >> > > overload?
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > > >> > > > > >> > > > > > > > > That's a very good question. I
> forgot
> > > > about
> > > > > > >> that
> > > > > > >> > one.
> > > > > > >> > > > > As
> > > > > > >> > > > > >> the
> > > > > > >> > > > > >> > > > > > > > > `java.util.regex.Pattern` is not
> fully
> > > > > > >> compatible
> > > > > > >> > with
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > > engine
> > > > > > >> > > > > >> > > > > that
> > > > > > >> > > > > >> > > > > > > > > we plan to use, it might be better
> to
> > > > deprecate
> > > > > > >> > it and
> > > > > > >> > > > > >> use an
> > > > > > >> > > > > >> > > > > overload
> > > > > > >> > > > > >> > > > > > > > > which takes a string. We would
> rely on
> > > > the
> > > > > > >> server
> > > > > > >> > side
> > > > > > >> > > > > >> > > validation.
> > > > > > >> > > > > >> > > > > > > > > During the migration, I think that
> we
> > > > could
> > > > > > >> still
> > > > > > >> > try to
> > > > > > >> > > > > >> > > toString
> > > > > > >> > > > > >> > > > > the
> > > > > > >> > > > > >> > > > > > > > > regex and use it. That should
> work, I
> > > > think, in
> > > > > > >> > the
> > > > > > >> > > > > >> majority
> > > > > > >> > > > > >> > > of the
> > > > > > >> > > > > >> > > > > > > > > cases.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 10. I agree with Justine that we
> > > > should be
> > > > > > >> > clearer
> > > > > > >> > > > > about
> > > > > > >> > > > > >> > the
> > > > > > >> > > > > >> > > > > reason
> > > > > > >> > > > > >> > > > > > > to
> > > > > > >> > > > > >> > > > > > > > > > switch to IBP/metadata.version
> from the
> > > > > > >> feature
> > > > > > >> > flag.
> > > > > > >> > > > > >> Maybe
> > > > > > >> > > > > >> > > we
> > > > > > >> > > > > >> > > > > mean
> > > > > > >> > > > > >> > > > > > > that
> > > > > > >> > > > > >> > > > > > > > > we
> > > > > > >> > > > > >> > > > > > > > > > can switch the default for the
> feature
> > > > flag
> > > > > > >> to
> > > > > > >> > true
> > > > > > >> > > > > >> based
> > > > > > >> > > > > >> > on
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > > > > > > metadata.version once we want to
> make
> > > > it the
> > > > > > >> > default.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > My plan was to use that feature
> flag
> > > > mainly
> > > > > > >> > during the
> > > > > > >> > > > > >> > > development
> > > > > > >> > > > > >> > > > > > > > > phase. I should not have mentioned
> it, I
> > > > think,
> > > > > > >> > because
> > > > > > >> > > > > we
> > > > > > >> > > > > >> > > could
> > > > > > >> > > > > >> > > > > use
> > > > > > >> > > > > >> > > > > > > > > an internal config for it.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 11. Some of the protocol APIs
> don't
> > > > mention
> > > > > > >> the
> > > > > > >> > > > > required
> > > > > > >> > > > > >> > > ACLs, it
> > > > > > >> > > > > >> > > > > > > would
> > > > > > >> > > > > >> > > > > > > > > be
> > > > > > >> > > > > >> > > > > > > > > > good to add that for consistency.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > Noted.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 12. It is a bit odd that
> > > > > > >> ConsumerGroupHeartbeat
> > > > > > >> > > > > requires
> > > > > > >> > > > > >> > > "Read
> > > > > > >> > > > > >> > > > > Group"
> > > > > > >> > > > > >> > > > > > > > > even
> > > > > > >> > > > > >> > > > > > > > > > though it seems to do more than
> > > > reading.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > I agree. This is how the current
> > > > protocol works
> > > > > > >> > though.
> > > > > > >> > > > > We
> > > > > > >> > > > > >> > only
> > > > > > >> > > > > >> > > > > > > > > require "Read Group" to join a
> group. We
> > > > could
> > > > > > >> > consider
> > > > > > >> > > > > >> > > changing
> > > > > > >> > > > > >> > > > > this
> > > > > > >> > > > > >> > > > > > > > > but I am not sure that it is worth
> it.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 13. How is topic recreation
> handled by
> > > > the
> > > > > > >> > consumer
> > > > > > >> > > > > with
> > > > > > >> > > > > >> > the
> > > > > > >> > > > > >> > > new
> > > > > > >> > > > > >> > > > > > > group
> > > > > > >> > > > > >> > > > > > > > > > protocol? It would be good to
> have a
> > > > section
> > > > > > >> on
> > > > > > >> > this.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > Noted. From a protocol
> perspective, the
> > > > new
> > > > > > >> topic
> > > > > > >> > will
> > > > > > >> > > > > >> have a
> > > > > > >> > > > > >> > > new
> > > > > > >> > > > > >> > > > > > > > > topic id so it will treat it like a
> > > > topic with
> > > > > > >> a
> > > > > > >> > > > > different
> > > > > > >> > > > > >> > > name.
> > > > > > >> > > > > >> > > > > The
> > > > > > >> > > > > >> > > > > > > > > only issue is that the fetch/commit
> > > > offsets
> > > > > > >> APIs
> > > > > > >> > do not
> > > > > > >> > > > > >> > support
> > > > > > >> > > > > >> > > > > topic
> > > > > > >> > > > > >> > > > > > > > > IDs so the consumer would reuse the
> > > > offsets
> > > > > > >> based
> > > > > > >> > on the
> > > > > > > >> > > > > >> > same name.
> > > > > > >> > > > > >> > > I
> > > > > > >> > > > > >> > > > > think
> > > > > > >> > > > > >> > > > > > > > > that we should update those APIs
> as well
> > > > in
> > > > > > >> order
> > > > > > >> > to be
> > > > > > >> > > > > >> > > consistent
> > > > > > >> > > > > >> > > > > end
> > > > > > >> > > > > >> > > > > > > > > to end. That would strengthen the
> > > > semantics of
> > > > > > >> the
> > > > > > >> > > > > >> consumer.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 14. The KIP mentions we will
> write the
> > > > new
> > > > > > >> > coordinator
> > > > > > >> > > > > >> in
> > > > > > >> > > > > >> > > Java.
> > > > > > >> > > > > >> > > > > Even
> > > > > > >> > > > > >> > > > > > > > > though
> > > > > > >> > > > > >> > > > > > > > > > this is an implementation
> detail, do
> > > > we plan
> > > > > > >> to
> > > > > > >> > have a
> > > > > > >> > > > > >> new
> > > > > > >> > > > > >> > > gradle
> > > > > > >> > > > > >> > > > > > > module
> > > > > > >> > > > > >> > > > > > > > > > for it?
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > Yes.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 15. Do we have a scalability
> goal when
> > > > it
> > > > > > >> comes
> > > > > > >> > to how
> > > > > > >> > > > > >> many
> > > > > > >> > > > > >> > > > > members
> > > > > > >> > > > > >> > > > > > > the
> > > > > > >> > > > > >> > > > > > > > > new
> > > > > > >> > > > > >> > > > > > > > > > group protocol can support?
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > We don't have numbers at the
> moment. The
> > > > > > >> protocol
> > > > > > >> > should
> > > > > > >> > > > > >> > > support
> > > > > > >> > > > > >> > > > > 1000s
> > > > > > >> > > > > >> > > > > > > > > of members per group. We will
> measure
> > > > this when
> > > > > > >> > we have
> > > > > > >> > > > > a
> > > > > > >> > > > > >> > first
> > > > > > >> > > > > >> > > > > > > > > implementation. Note that we might
> have
> > > > other
> > > > > > >> > > > > bottlenecks
> > > > > > >> > > > > >> > down
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > > > > > road (e.g. offset commits).
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > 16. Did we consider having
> > > > SubscribedTopicIds
> > > > > > >> > instead
> > > > > > >> > > > > of
> > > > > > >> > > > > >> > > > > > > > > > SubscribedTopicNames in
> > > > > > >> > ConsumerGroupHeartbeatRequest?
> > > > > > >> > > > > >> Is
> > > > > > >> > > > > >> > the
> > > > > > >> > > > > >> > > > > idea
> > > > > > >> > > > > >> > > > > > > that
> > > > > > >> > > > > >> > > > > > > > > > since we have to resolve the
> regex on
> > > > the
> > > > > > >> > server, we
> > > > > > >> > > > > >> can do
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > same
> > > > > > >> > > > > >> > > > > > > for
> > > > > > >> > > > > >> > > > > > > > > > the topic name? The difference
> is that
> > > > > > >> sending
> > > > > > >> > the
> > > > > > >> > > > > >> regex is
> > > > > > >> > > > > >> > > more
> > > > > > >> > > > > >> > > > > > > > > efficient
> > > > > > >> > > > > >> > > > > > > > > > whereas sending the topic names
> is less
> > > > > > >> > efficient.
> > > > > > >> > > > > >> > > Furthermore,
> > > > > > >> > > > > >> > > > > > > delete
> > > > > > >> > > > > >> > > > > > > > > and
> > > > > > >> > > > > >> > > > > > > > > > recreation is easier to handle
> if we
> > > > have
> > > > > > >> topic
> > > > > > >> > ids.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > The idea was to consolidate the
> metadata
> > > > lookup
> > > > > > >> > on the
> > > > > > >> > > > > >> server
> > > > > > >> > > > > >> > > for
> > > > > > >> > > > > >> > > > > both
> > > > > > >> > > > > >> > > > > > > > > paths but I do agree with your
> point. On
> > > > > > > >> second
> > > > > > > >> > > > > thought,
> > > > > > >> > > > > >> > using
> > > > > > >> > > > > >> > > > > topic
> > > > > > >> > > > > >> > > > > > > > > ids may be better here for the
> delete and
> > > > > > >> > recreation
> > > > > > >> > > > > case.
> > > > > > >> > > > > >> > > Also, I
> > > > > > >> > > > > >> > > > > > > > > suppose that we may allow users to
> > > > subscribe
> > > > > > >> with
> > > > > > >> > topic
> > > > > > >> > > > > >> ids
> > > > > > >> > > > > >> > in
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > > > > > future because that is the only
> way to be
> > > > > > >> really
> > > > > > >> > robust
> > > > > > >> > > > > to
> > > > > > >> > > > > >> > > topic
> > > > > > >> > > > > >> > > > > > > > > re-creation.
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > Best,
> > > > > > >> > > > > >> > > > > > > > > David
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > On Tue, Jul 12, 2022 at 1:38 PM
> David
> > > > Jacot <
> > > > > > >> > > > > >> > > dja...@confluent.io>
> > > > > > >> > > > > >> > > > > > > wrote:
> > > > > > >> > > > > >> > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > Hi Justine,
> > > > > > >> > > > > >> > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > Thanks for your comments. Please
> find
> > > > my
> > > > > > >> answers
> > > > > > >> > > > > below.
> > > > > > >> > > > > >> > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > - Yes, the new protocol relies on
> > > > topic IDs
> > > > > > >> > with the
> > > > > > >> > > > > >> > > exception
> > > > > > >> > > > > >> > > > > of the
> > > > > > > >> > > > > >> > > > > > > > > > topic names used in the
> > > > > > >> > > > > ConsumerGroupHeartbeatRequest.
> > > > > > >> > > > > >> I
> > > > > > >> > > > > >> > am
> > > > > > >> > > > > >> > > not
> > > > > > >> > > > > >> > > > > sure
> > > > > > >> > > > > >> > > > > > > > > > if using topic names is the
> right call
> > > > here.
> > > > > > >> I
> > > > > > >> > need to
> > > > > > >> > > > > >> > think
> > > > > > >> > > > > >> > > > > about it
> > > > > > >> > > > > >> > > > > > > > > > a little more. Obviously, the
> KIP does
> > > > not
> > > > > > >> > change the
> > > > > > >> > > > > >> > > > > fetch/commit
> > > > > > >> > > > > >> > > > > > > > > > offsets RPCs to use topic IDs.
> This
> > > > may be
> > > > > > >> > something
> > > > > > >> > > > > >> that
> > > > > > >> > > > > >> > we
> > > > > > >> > > > > >> > > > > should
> > > > > > >> > > > > >> > > > > > > > > > include though as it would give
> better
> > > > > > >> overall
> > > > > > > >> > > > > >> guarantees in
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > > > > > > producer.
> > > > > > >> > > > > >> > > > > > > > > > - You're right. I think that I
> should
> > > > not
> > > > > > >> have
> > > > > > >> > > > > mentioned
> > > > > > >> > > > > >> > this
> > > > > > >> > > > > >> > > > > flag at
> > > > > > >> > > > > >> > > > > > > > > > all. I will remove it. We can
> use an
> > > > internal
> > > > > > >> > > > > >> configuration
> > > > > > >> > > > > >> > > while
> > > > > > >> > > > > >> > > > > > > > > > developing the feature.
> > > > > > >> > > > > >> > > > > > > > > > - Both cluster types will be
> > > > supported. The
> > > > > > >> > change is
> > > > > > >> > > > > >> > > > > orthogonal. The
> > > > > > >> > > > > >> > > > > > > > > > only requirement is that the
> cluster
> > > > uses
> > > > > > >> topic
> > > > > > >> > IDs.
> > > > > > >> > > > > >> > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > Best,
> > > > > > >> > > > > >> > > > > > > > > > David
> > > > > > >> > > > > >> > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > On Mon, Jul 11, 2022 at 9:53 PM
> > > > Guozhang
> > > > > > >> Wang <
> > > > > > >> > > > > >> > > > > wangg...@gmail.com>
> > > > > > >> > > > > >> > > > > > > > > wrote:
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > Hi Ismael,
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > Thanks for the feedback. Here
> are
> > > > some
> > > > > > >> replies
> > > > > > >> > > > > inlined
> > > > > > >> > > > > >> > > below:
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > On Sat, Jul 9, 2022 at 2:53 AM
> > > > Ismael Juma
> > > > > > >> <
> > > > > > >> > > > > >> > > ism...@juma.me.uk>
> > > > > > >> > > > > >> > > > > > > wrote:
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > Thanks for the KIP. This has
> the
> > > > > > >> potential
> > > > > > >> > to be a
> > > > > > >> > > > > >> > great
> > > > > > >> > > > > >> > > > > > > > > improvement. A few
> > > > > > >> > > > > >> > > > > > > > > > > > initial questions/comments:
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 1. I think it's premature to
> talk
> > > > about
> > > > > > >> > target
> > > > > > >> > > > > >> versions
> > > > > > >> > > > > >> > > for
> > > > > > >> > > > > >> > > > > > > > > deprecation and
> > > > > > >> > > > > >> > > > > > > > > > > > removal of the existing group
> > > > protocol.
> > > > > > >> > Unlike
> > > > > > >> > > > > >> KRaft,
> > > > > > >> > > > > >> > > this
> > > > > > >> > > > > >> > > > > > > affects a
> > > > > > >> > > > > >> > > > > > > > > core
> > > > > > >> > > > > >> > > > > > > > > > > > client protocol and hence
> > > > > > >> > deprecation/removal will
> > > > > > >> > > > > >> be
> > > > > > >> > > > > >> > > heavily
> > > > > > >> > > > > >> > > > > > > > > dependent on
> > > > > > >> > > > > >> > > > > > > > > > > > how quickly applications
> migrate
> > > > to the
> > > > > > >> new
> > > > > > >> > > > > >> protocol.
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > Yeah I agree with you. I think
> we can
> > > > > > >> remove
> > > > > > >> > the
> > > > > > >> > > > > >> proposed
> > > > > > >> > > > > >> > > > > timeline
> > > > > > >> > > > > >> > > > > > > in
> > > > > > >> > > > > >> > > > > > > > > the
> > > > > > >> > > > > >> > > > > > > > > > > `Compatibility, Deprecation,
> and
> > > > Migration
> > > > > > >> > Plan` and
> > > > > > >> > > > > >> > > instead
> > > > > > >> > > > > >> > > > > just
> > > > > > >> > > > > >> > > > > > > state
> > > > > > >> > > > > >> > > > > > > > > > > that we will decide in the
> future
> > > > about
> > > > > > >> when
> > > > > > >> > we
> > > > > > >> > > > > would
> > > > > > > >> > > > > >> > > > > deprecate the old
> > > > > > >> > > > > >> > > > > > > > > > > protocol and behaviors.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 2. The KIP says we intend to
> > > > release this
> > > > > > >> > in 4.x,
> > > > > > >> > > > > >> but
> > > > > > >> > > > > >> > it
> > > > > > >> > > > > >> > > > > wasn't
> > > > > > >> > > > > >> > > > > > > made
> > > > > > >> > > > > >> > > > > > > > > clear
> > > > > > >> > > > > >> > > > > > > > > > > > why. If we added that as a
> way to
> > > > > > >> estimate
> > > > > > >> > when
> > > > > > >> > > > > we'd
> > > > > > >> > > > > >> > > > > deprecate
> > > > > > >> > > > > >> > > > > > > and
> > > > > > >> > > > > >> > > > > > > > > remove
> > > > > > >> > > > > >> > > > > > > > > > > > the group protocol, I also
> suggest
> > > > > > >> removing
> > > > > > >> > this
> > > > > > >> > > > > >> part.
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > I think that's not specifically
> > > > related to
> > > > > > >> the
> > > > > > >> > > > > >> > > > > deprecation/removal
> > > > > > >> > > > > >> > > > > > > > > timeline
> > > > > > >> > > > > >> > > > > > > > > > > plan, but it's more for client
> > > > upgrades.
> > > > > > >> I.e.
> > > > > > >> > the
> > > > > > >> > > > > >> > > broker-side
> > > > > > >> > > > > >> > > > > > > > > > > implementation may be done
> first,
> > > > and then
> > > > > > >> the
> > > > > > >> > > > > client
> > > > > > >> > > > > >> > side,
> > > > > > >> > > > > >> > > > > and we
> > > > > > >> > > > > >> > > > > > > > > would
> > > > > > >> > > > > >> > > > > > > > > > > only mark it as "released" by
> the
> > > > time
> > > > > > > >> client
> > > > > > >> > > > > >> > > implementations
> > > > > > >> > > > > >> > > > > are
> > > > > > >> > > > > >> > > > > > > > > done. At
> > > > > > >> > > > > >> > > > > > > > > > > that time, to enable the
> feature the
> > > > > > >> clients
> > > > > > >> > need to
> > > > > > >> > > > > >> > first
> > > > > > >> > > > > >> > > > > swap-in
> > > > > > >> > > > > >> > > > > > > the
> > > > > > >> > > > > >> > > > > > > > > > > bytecode with a rolling bounce
> and
> > > > then set
> > > > > > >> > the flag
> > > > > > >> > > > > >> > with a
> > > > > > >> > > > > >> > > > > second
> > > > > > >> > > > > >> > > > > > > > > rolling
> > > > > > >> > > > > >> > > > > > > > > > > bounce, and hence we feel it's
> > > > better to be
> > > > > > >> > released
> > > > > > >> > > > > >> in a
> > > > > > >> > > > > >> > > major
> > > > > > >> > > > > >> > > > > > > > > version.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 3. We need to flesh out the
> > > > details of
> > > > > > >> the
> > > > > > >> > > > > migration
> > > > > > >> > > > > >> > > story.
> > > > > > >> > > > > >> > > > > It
> > > > > > >> > > > > >> > > > > > > > > sounds like
> > > > > > >> > > > > >> > > > > > > > > > > > we're saying we will support
> online
> > > > > > >> > migrations. Is
> > > > > > >> > > > > >> that
> > > > > > >> > > > > >> > > > > correct?
> > > > > > >> > > > > >> > > > > > > We
> > > > > > >> > > > > >> > > > > > > > > should
> > > > > > >> > > > > >> > > > > > > > > > > > explain this in detail. It
> could
> > > > also be
> > > > > > >> > done as a
> > > > > > >> > > > > >> > > separate
> > > > > > >> > > > > >> > > > > KIP,
> > > > > > >> > > > > >> > > > > > > if
> > > > > > >> > > > > >> > > > > > > > > it's
> > > > > > >> > > > > >> > > > > > > > > > > > easier.
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > Yes I think that's the part we
> can
> > > > be more
> > > > > > >> > concrete
> > > > > > >> > > > > >> about
> > > > > > >> > > > > >> > > for
> > > > > > >> > > > > >> > > > > sure
> > > > > > >> > > > > >> > > > > > > (and
> > > > > > >> > > > > >> > > > > > > > > > > this is related to your
> question 2
> > > > above).
> > > > > > >> > We will
> > > > > > >> > > > > >> work
> > > > > > >> > > > > >> > on
> > > > > > >> > > > > >> > > > > making
> > > > > > >> > > > > >> > > > > > > it
> > > > > > >> > > > > >> > > > > > > > > more
> > > > > > >> > > > > >> > > > > > > > > > > explicit in parallel as we
> solicit
> > > > more
> > > > > > >> > feedback.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 4. I am happy that we are
> pushing
> > > > the
> > > > > > >> > pattern
> > > > > > >> > > > > >> > > subscriptions
> > > > > > >> > > > > >> > > > > to
> > > > > > >> > > > > >> > > > > > > the
> > > > > > >> > > > > >> > > > > > > > > server,
> > > > > > >> > > > > >> > > > > > > > > > > > but it seems like there
> could be
> > > > some
> > > > > > >> tricky
> > > > > > >> > > > > >> > > compatibility
> > > > > > >> > > > > >> > > > > > > issues.
> > > > > > >> > > > > >> > > > > > > > > Will we
> > > > > > >> > > > > >> > > > > > > > > > > > have a mechanism for users to
> > > > detect that
> > > > > > >> > they
> > > > > > >> > > > > need
> > > > > > >> > > > > >> to
> > > > > > >> > > > > >> > > update
> > > > > > >> > > > > >> > > > > > > their
> > > > > > >> > > > > >> > > > > > > > > regex
> > > > > > >> > > > > >> > > > > > > > > > > > before switching to the new
> > > > protocol?
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > Yes I think we need some
> tooling for
> > > > > > >> non-java
> > > > > > >> > client
> > > > > > >> > > > > >> > users
> > > > > > >> > > > > >> > > to
> > > > > > >> > > > > >> > > > > sort
> > > > > > >> > > > > >> > > > > > > of
> > > > > > >> > > > > >> > > > > > > > > > > "dry-run" the client before
> > > > switching to
> > > > > > >> the
> > > > > > >> > new
> > > > > > >> > > > > >> > protocol.
> > > > > > >> > > > > >> > > I
> > > > > > >> > > > > >> > > > > do not
> > > > > > >> > > > > >> > > > > > > > > have a
> > > > > > > >> > > > > >> > > > > > > > > > > specific idea off the top of my head
> > > > though,
> > > > > > >> maybe
> > > > > > >> > others
> > > > > > >> > > > > >> like
> > > > > > >> > > > > >> > > @Matt
> > > > > > >> > > > > >> > > > > > > > > Howlett can
> > > > > > >> > > > > >> > > > > > > > > > > chime-in here?
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 5. Related to the last
> question,
> > > > will the
> > > > > > >> > Java
> > > > > > >> > > > > >> client
> > > > > > >> > > > > >> > > allow
> > > > > > >> > > > > >> > > > > the
> > > > > > >> > > > > >> > > > > > > > > users to
> > > > > > >> > > > > >> > > > > > > > > > > > stick with the current regex
> > > > engine for
> > > > > > >> > > > > >> compatibility
> > > > > > >> > > > > >> > > > > reasons?
> > > > > > >> > > > > >> > > > > > > For
> > > > > > >> > > > > >> > > > > > > > > example,
> > > > > > >> > > > > >> > > > > > > > > > > > it may be handy to keep using
> > > > client
> > > > > > >> based
> > > > > > >> > regex
> > > > > > >> > > > > at
> > > > > > >> > > > > >> > > first to
> > > > > > >> > > > > >> > > > > keep
> > > > > > >> > > > > >> > > > > > > > > > > > migrations simple and then
> migrate
> > > > to
> > > > > > >> > server based
> > > > > > >> > > > > >> > > regexes
> > > > > > >> > > > > >> > > > > as a
> > > > > > >> > > > > >> > > > > > > > > second
> > > > > > >> > > > > >> > > > > > > > > > > > step.
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > Honestly I have not thought
> about
> > > > that for
> > > > > > >> > java
> > > > > > >> > > > > >> clients,
> > > > > > >> > > > > >> > > and
> > > > > > >> > > > > >> > > > > we can
> > > > > > >> > > > > >> > > > > > > > > discuss
> > > > > > >> > > > > >> > > > > > > > > > > that. What kind of
> compatibility
> > > > issues do
> > > > > > >> > you have
> > > > > > >> > > > > in
> > > > > > >> > > > > >> > > mind?
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 6. When we say that the group
> > > > coordinator
> > > > > > >> > will be
> > > > > > >> > > > > >> > > > > responsible for
> > > > > > >> > > > > >> > > > > > > > > storing
> > > > > > >> > > > > >> > > > > > > > > > > > the configurations and that
> the
> > > > > > >> > configurations
> > > > > > >> > > > > will
> > > > > > >> > > > > >> be
> > > > > > >> > > > > >> > > > > deleted
> > > > > > >> > > > > >> > > > > > > when
> > > > > > >> > > > > >> > > > > > > > > the
> > > > > > >> > > > > >> > > > > > > > > > > > group is deleted. Will a
> > > > transition to
> > > > > > >> DEAD
> > > > > > >> > > > > trigger
> > > > > > >> > > > > >> > > deletion
> > > > > > >> > > > > >> > > > > of
> > > > > > >> > > > > >> > > > > > > > > > > > configurations?
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > Yes, since the DEAD state is an
> > > > ending
> > > > > > >> state
> > > > > > >> > (we
> > > > > > >> > > > > would
> > > > > > >> > > > > >> > only
> > > > > > >> > > > > >> > > > > > > transit to
> > > > > > >> > > > > >> > > > > > > > > that
> > > > > > >> > > > > >> > > > > > > > > > > state when the group is EMPTY
> and
> > > > also all
> > > > > > >> of
> > > > > > >> > its
> > > > > > >> > > > > >> > metadata
> > > > > > >> > > > > >> > > are
> > > > > > >> > > > > >> > > > > > > gone),
> > > > > > >> > > > > >> > > > > > > > > once
> > > > > > >> > > > > >> > > > > > > > > > > it's transited to DEAD this
> group
> > > > would
> > > > > > >> never
> > > > > > >> > be
> > > > > > >> > > > > >> revived.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 7. Will the choice to store
> the
> > > > configs
> > > > > > >> in
> > > > > > >> > the
> > > > > > >> > > > > group
> > > > > > >> > > > > >> > > > > coordinator
> > > > > > >> > > > > >> > > > > > > > > make it
> > > > > > >> > > > > >> > > > > > > > > > > > harder to list all cluster
> configs
> > > > and
> > > > > > >> their
> > > > > > >> > > > > values?
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > That's a good question, and our
> > > > thoughts
> > > > > > >> are
> > > > > > >> > that
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > > so-called
> > > > > > >> > > > > >> > > > > > > "group
> > > > > > >> > > > > >> > > > > > > > > > > configurations" are overrides
> of the
> > > > > > >> > cluster-level
> > > > > > >> > > > > >> > > > > configurations
> > > > > > >> > > > > >> > > > > > > > > > > customized per group so when an
> > > > > admin lists
> > > > > > >> > cluster
> > > > > > >> > > > > >> > configs
> > > > > > >> > > > > >> > > it's
> > > > > > >> > > > > >> > > > > > > okay to
> > > > > > >> > > > > >> > > > > > > > > > > list just the cluster-level
> > > > "defaults", not
> > > > > > >> > showing
> > > > > > >> > > > > >> any
> > > > > > >> > > > > >> > > > > per-group
> > > > > > >> > > > > >> > > > > > > > > > > customizations.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 8. How would someone
> configure a
> > > > group
> > > > > > >> > before
> > > > > > >> > > > > >> starting
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > > > > > consumers? Have
> > > > > > >> > > > > >> > > > > > > > > > > > we considered allowing the
> explicit
> > > > > > >> > creation of
> > > > > > >> > > > > >> groups?
> > > > > > >> > > > > >> > > > > > > > > Alternatively, the
> > > > > > >> > > > > >> > > > > > > > > > > > configs could be decoupled
> from
> > > > the group
> > > > > > >> > > > > lifecycle.
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > The configs can be created
> before
> > > > the group
> > > > > > >> > itself
> > > > > > >> > > > > as
> > > > > > >> > > > > >> an
> > > > > > >> > > > > >> > > > > > > independent
> > > > > > >> > > > > >> > > > > > > > > entity
> > > > > > >> > > > > >> > > > > > > > > > > --- of course, this requires
> the
> > > > > > >> corresponding
> > > > > > >> > > > > >> request to
> > > > > > >> > > > > >> > > be
> > > > > > >> > > > > >> > > > > > > routed to
> > > > > > >> > > > > >> > > > > > > > > the
> > > > > > >> > > > > >> > > > > > > > > > > right coordinator based on the
> group
> > > > id ---
> > > > > > >> > the only
> > > > > > >> > > > > >> > thing
> > > > > > >> > > > > >> > > that
> > > > > > >> > > > > >> > > > > > > > > differs is,
> > > > > > >> > > > > >> > > > > > > > > > > when the group itself is gone
> we
> > > > also check
> > > > > > >> > if there
> > > > > > >> > > > > >> are
> > > > > > >> > > > > >> > > any
> > > > > > >> > > > > >> > > > > > > > > configuration
> > > > > > >> > > > > >> > > > > > > > > > > entities related to that group
> and
> > > > delete
> > > > > > >> as
> > > > > > >> > well.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > Admittedly this indeed
> introduces an
> > > > > > >> > asymmetry on
> > > > > > >> > > > > the
> > > > > > >> > > > > >> > > creation
> > > > > > >> > > > > >> > > > > /
> > > > > > >> > > > > >> > > > > > > > > deletion
> > > > > > >> > > > > >> > > > > > > > > > > lifecycles of the config
> entities,
> > > > and we
> > > > > > >> > would like
> > > > > > >> > > > > >> to
> > > > > > >> > > > > >> > > hear
> > > > > > >> > > > > >> > > > > > > everyone's
> > > > > > >> > > > > >> > > > > > > > > > > feelings whether we should aim
> for
> > > > symmetry
> > > > > > >> > i.e.
> > > > > > >> > > > > >> totally
> > > > > > >> > > > > >> > > > > decouple
> > > > > > >> > > > > >> > > > > > > group
> > > > > > >> > > > > >> > > > > > > > > > > configs and hence not delete
> them at
> > > > all
> > > > > > >> when
> > > > > > >> > the
> > > > > > >> > > > > >> group
> > > > > > >> > > > > >> > is
> > > > > > >> > > > > >> > > > > gone,
> > > > > > >> > > > > >> > > > > > > but
> > > > > > >> > > > > >> > > > > > > > > always
> > > > > > >> > > > > >> > > > > > > > > > > require explicit deletion
> operations
> > > > by
> > > > > > >> > themselves.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 9. Will the
> Consumer.subscribe
> > > > method for
> > > > > > >> > the Java
> > > > > > >> > > > > >> > client
> > > > > > >> > > > > >> > > > > still
> > > > > > >> > > > > >> > > > > > > take
> > > > > > >> > > > > >> > > > > > > > > a
> > > > > > > >> > > > > >> > > > > > > > > > > > `java.util.regex.Pattern` or
> do we
> > > > have
> > > > > > >> to
> > > > > > >> > > > > >> introduce an
> > > > > > >> > > > > >> > > > > overload?
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > I think we do not need to
> introduce
> > > > an
> > > > > > >> > overload, but
> > > > > > >> > > > > >> I'm
> > > > > > >> > > > > >> > > all
> > > > > > >> > > > > >> > > > > ears
> > > > > > >> > > > > >> > > > > > > if
> > > > > > >> > > > > >> > > > > > > > > there
> > > > > > >> > > > > >> > > > > > > > > > > may be some compatibility
> issues
> > > > that we
> > > > > > >> may
> > > > > > >> > > > > overlook.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 10. I agree with Justine
> that we
> > > > should
> > > > > > >> be
> > > > > > >> > clearer
> > > > > > >> > > > > >> > about
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > > > reason
> > > > > > >> > > > > >> > > > > > > > > to
> > > > > > >> > > > > >> > > > > > > > > > > > switch to
> IBP/metadata.version
> > > > from the
> > > > > > >> > feature
> > > > > > >> > > > > >> flag.
> > > > > > >> > > > > >> > > Maybe
> > > > > > >> > > > > >> > > > > we
> > > > > > >> > > > > >> > > > > > > mean
> > > > > > >> > > > > >> > > > > > > > > that we
> > > > > > >> > > > > >> > > > > > > > > > > > can switch the default for
> the
> > > > feature
> > > > > > >> flag
> > > > > > >> > to
> > > > > > >> > > > > true
> > > > > > >> > > > > >> > > based on
> > > > > > >> > > > > >> > > > > the
> > > > > > >> > > > > >> > > > > > > > > > > > metadata.version once we
> want to
> > > > make it
> > > > > > >> the
> > > > > > >> > > > > >> default.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > 11. Some of the protocol APIs
> don't
> > > > mention
> > > > > > >> > the
> > > > > > >> > > > > >> required
> > > > > > >> > > > > >> > > ACLs,
> > > > > > >> > > > > >> > > > > it
> > > > > > >> > > > > >> > > > > > > > > would be
> > > > > > >> > > > > >> > > > > > > > > > > > good to add that for
> consistency.
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > Ack, we can certainly do that.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 12. It is a bit odd that
> > > > > > >> > ConsumerGroupHeartbeat
> > > > > > >> > > > > >> > requires
> > > > > > >> > > > > >> > > > > "Read
> > > > > > >> > > > > >> > > > > > > > > Group" even
> > > > > > >> > > > > >> > > > > > > > > > > > though it seems to do more
> than
> > > > reading.
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > I had that thought myself as
> well,
> > > > but in
> > > > > > >> the
> > > > > > >> > end we
> > > > > > >> > > > > >> > could
> > > > > > >> > > > > >> > > not
> > > > > > >> > > > > >> > > > > > > find a
> > > > > > >> > > > > >> > > > > > > > > > > better alternative: adding
> Write
> > > > Group
> > > > > > >> seems
> > > > > > >> > an
> > > > > > >> > > > > >> overkill
> > > > > > >> > > > > >> > > here
> > > > > > >> > > > > >> > > > > > > since we
> > > > > > >> > > > > >> > > > > > > > > do
> > > > > > >> > > > > >> > > > > > > > > > > not have it elsewhere (we only
> have
> > > > Read /
> > > > > > >> > Delete
> > > > > > >> > > > > and
> > > > > > >> > > > > >> > > Describe
> > > > > > >> > > > > >> > > > > on
> > > > > > >> > > > > >> > > > > > > > > groups so
> > > > > > > >> > > > > >> > > > > > > > > > > far). Would like to hear others'
> > > > thoughts.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 13. How is topic recreation
> > > > handled by
> > > > > > >> the
> > > > > > >> > > > > consumer
> > > > > > >> > > > > >> > with
> > > > > > >> > > > > >> > > the
> > > > > > >> > > > > >> > > > > new
> > > > > > >> > > > > >> > > > > > > > > group
> > > > > > >> > > > > >> > > > > > > > > > > > protocol? It would be good
> to have
> > > > a
> > > > > > >> > section on
> > > > > > >> > > > > >> this.
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > You mean with regex
> subscription
> > > > right? Yes
> > > > > > >> > we can
> > > > > > >> > > > > >> add a
> > > > > > >> > > > > >> > > > > section
> > > > > > >> > > > > >> > > > > > > about
> > > > > > >> > > > > >> > > > > > > > > > > that, but basically the idea
> is that
> > > > > > >> consumer
> > > > > > >> > would
> > > > > > >> > > > > be
> > > > > > >> > > > > >> > > totally
> > > > > > >> > > > > >> > > > > > > > > agnostic in
> > > > > > >> > > > > >> > > > > > > > > > > the new protocol as it's
> all handled
> > > > by the
> > > > > > >> > brokers.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 14. The KIP mentions we will
> write
> > > > the
> > > > > > >> new
> > > > > > >> > > > > >> coordinator
> > > > > > >> > > > > >> > in
> > > > > > >> > > > > >> > > > > Java.
> > > > > > >> > > > > >> > > > > > > Even
> > > > > > >> > > > > >> > > > > > > > > though
> > > > > > >> > > > > >> > > > > > > > > > > > this is an implementation
> detail,
> > > > do we
> > > > > > >> > plan to
> > > > > > >> > > > > >> have a
> > > > > > >> > > > > >> > > new
> > > > > > >> > > > > >> > > > > gradle
> > > > > > >> > > > > >> > > > > > > > > module
> > > > > > >> > > > > >> > > > > > > > > > > > for it?
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > We have not thought about
> that. But
> > > > I think
> > > > > > >> > the
> > > > > > >> > > > > answer
> > > > > > >> > > > > >> > > should
> > > > > > >> > > > > >> > > > > be
> > > > > > >> > > > > >> > > > > > > yes.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 15. Do we have a scalability
> goal
> > > > when it
> > > > > > >> > comes to
> > > > > > >> > > > > >> how
> > > > > > >> > > > > >> > > many
> > > > > > >> > > > > >> > > > > > > members
> > > > > > >> > > > > >> > > > > > > > > the new
> > > > > > >> > > > > >> > > > > > > > > > > > group protocol can support?
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > Within a group, I think we
> should
> > > > shoot for
> > > > > > >> > 1000s of
> > > > > > >> > > > > >> > > members.
> > > > > > >> > > > > >> > > > > But
> > > > > > >> > > > > >> > > > > > > that
> > > > > > > >> > > > > >> > > > > > > > > > > scalability goal also depends
> on the
> > > > offset
> > > > > > >> > > > > management
> > > > > > >> > > > > >> > > (commit,
> > > > > > >> > > > > >> > > > > > > fetch)
> > > > > > >> > > > > >> > > > > > > > > > > capabilities of the coordinator
> > > > which we
> > > > > > >> did
> > > > > > >> > not
> > > > > > >> > > > > >> cover in
> > > > > > >> > > > > >> > > this
> > > > > > >> > > > > >> > > > > > > KIP, so
> > > > > > >> > > > > >> > > > > > > > > it's
> > > > > > >> > > > > >> > > > > > > > > > > hard to give a number that
> applies
> > > > > > >> > universally.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > 16. Did we consider having
> > > > > > > >> > SubscribedTopicIds
> > > > > > >> > > > > >> instead
> > > > > > >> > > > > >> > of
> > > > > > >> > > > > >> > > > > > > > > > > > SubscribedTopicNames in
> > > > > > >> > > > > >> ConsumerGroupHeartbeatRequest?
> > > > > > >> > > > > >> > > Is the
> > > > > > >> > > > > >> > > > > > > idea
> > > > > > >> > > > > >> > > > > > > > > that
> > > > > > >> > > > > >> > > > > > > > > > > > since we have to resolve the
> regex
> > > > on the
> > > > > > >> > server,
> > > > > > >> > > > > we
> > > > > > >> > > > > >> > can
> > > > > > >> > > > > >> > > do
> > > > > > >> > > > > >> > > > > the
> > > > > > >> > > > > >> > > > > > > same
> > > > > > >> > > > > >> > > > > > > > > for
> > > > > > >> > > > > >> > > > > > > > > > > > the topic name? The
> difference is
> > > > that
> > > > > > >> > sending the
> > > > > > >> > > > > >> > regex
> > > > > > >> > > > > >> > > is
> > > > > > >> > > > > >> > > > > more
> > > > > > >> > > > > >> > > > > > > > > efficient
> > > > > > >> > > > > >> > > > > > > > > > > > whereas sending the topic
> names is
> > > > less
> > > > > > >> > efficient.
> > > > > > >> > > > > >> > > > > Furthermore,
> > > > > > >> > > > > >> > > > > > > > > delete and
> > > > > > >> > > > > >> > > > > > > > > > > > recreation is easier to
> handle if
> > > > we have
> > > > > > >> > topic
> > > > > > >> > > > > ids.
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > The main reason to still let
> the
> > > > clients
> > > > > > >> send
> > > > > > >> > names
> > > > > > >> > > > > >> is to
> > > > > > >> > > > > >> > > keep
> > > > > > >> > > > > >> > > > > the
> > > > > > >> > > > > >> > > > > > > > > > > reasoning of names -> ids on
> the
> > > > broker /
> > > > > > >> > admin
> > > > > > >> > > > > client
> > > > > > >> > > > > >> > > only.
> > > > > > >> > > > > >> > > > > Note
> > > > > > >> > > > > >> > > > > > > that
> > > > > > >> > > > > >> > > > > > > > > > > although we added topic id in
> > > > KIP-516, we
> > > > > > >> > never
> > > > > > >> > > > > >> > > implemented the
> > > > > > >> > > > > >> > > > > > > logic
> > > > > > >> > > > > >> > > > > > > > > on
> > > > > > >> > > > > >> > > > > > > > > > > consumer/producers leveraging
> the
> > > > related
> > > > > > >> > newer
> > > > > > >> > > > > >> versioned
> > > > > > >> > > > > >> > > RPCs,
> > > > > > >> > > > > >> > > > > > > > > instead we
> > > > > > >> > > > > >> > > > > > > > > > > just set the topic id as empty
> UUID.
> > > > We
> > > > > > >> want
> > > > > > >> > to keep
> > > > > > >> > > > > >> the
> > > > > > >> > > > > >> > > > > > > > > consumer/producer
> > > > > > >> > > > > >> > > > > > > > > > > to be thin and only delegate
> the
> > > > reasoning
> > > > > > >> on
> > > > > > >> > broker
> > > > > > >> > > > > >> and
> > > > > > >> > > > > >> > > > > > > potentially
> > > > > > >> > > > > >> > > > > > > > > admin
> > > > > > >> > > > > >> > > > > > > > > > > clients.
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > Thanks,
> > > > > > >> > > > > >> > > > > > > > > > > > Ismael
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > On Wed, Jul 6, 2022 at 10:45
> AM
> > > > David
> > > > > > >> Jacot
> > > > > > >> > > > > >> > > > > > > > > <dja...@confluent.io.invalid>
> > > > > > >> > > > > >> > > > > > > > > > > > wrote:
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > > Hi all,
> > > > > > >> > > > > >> > > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > > I would like to start a
> > > > discussion
> > > > > > >> thread
> > > > > > >> > on
> > > > > > >> > > > > >> KIP-848:
> > > > > > >> > > > > >> > > The
> > > > > > >> > > > > >> > > > > Next
> > > > > > >> > > > > >> > > > > > > > > > > > > Generation of the Consumer
> > > > Rebalance
> > > > > > >> > Protocol.
> > > > > > >> > > > > >> With
> > > > > > >> > > > > >> > > this
> > > > > > >> > > > > >> > > > > KIP,
> > > > > > >> > > > > >> > > > > > > we
> > > > > > >> > > > > >> > > > > > > > > aim
> > > > > > >> > > > > >> > > > > > > > > > > > > to make the rebalance
> protocol
> > > > (for
> > > > > > >> > consumers)
> > > > > > >> > > > > >> more
> > > > > > >> > > > > >> > > > > reliable,
> > > > > > >> > > > > >> > > > > > > more
> > > > > > >> > > > > >> > > > > > > > > > > > > scalable, easier to
> implement for
> > > > > > >> > clients, and
> > > > > > >> > > > > >> easier
> > > > > > >> > > > > >> > > to
> > > > > > >> > > > > >> > > > > debug
> > > > > > >> > > > > >> > > > > > > for
> > > > > > >> > > > > >> > > > > > > > > > > > > operators.
> > > > > > >> > > > > >> > > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > > The KIP is here:
> > > > > > >> > > > > >> > > > >
> https://cwiki.apache.org/confluence/x/HhD1D.
> > > > > > >> > > > > >> > > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > > Please take a look and let
> me
> > > > know what
> > > > > > >> > you
> > > > > > >> > > > > think.
> > > > > > >> > > > > >> > > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > > Best,
> > > > > > >> > > > > >> > > > > > > > > > > > > David
> > > > > > >> > > > > >> > > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > > > PS: I will be away from
> July
> > > > 18th to
> > > > > > >> > August 8th.
> > > > > > >> > > > > >> That
> > > > > > >> > > > > >> > > gives
> > > > > > >> > > > > >> > > > > > > you a
> > > > > > >> > > > > >> > > > > > > > > bit
> > > > > > >> > > > > >> > > > > > > > > > > > > of time to read and digest
> this
> > > > long
> > > > > > >> KIP.
> > > > > > >> > > > > >> > > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > >
> > > > > > >> > > > > >> > > > > > > > > > > --
> > > > > > >> > > > > >> > > > > > > > > > > -- Guozhang
> > > > > > >> > > > > >> > > > > > > > >
> > > > > > >> > > > > >> > > > > > > >
> > > > > > >> > > > > >> > > > > > > >
> > > > > > >> > > > > >> > > > > > > > --
> > > > > > >> > > > > >> > > > > > > > -- Guozhang
> > > > > > >> > > > > >> > > > > > >
> > > > > > >> > > > > >> > > > >
> > > > > > >> > > > > >> > >
> > > > > > >> > > > > >> > >
> > > > > > >> > > > > >> >
> > > > > > >> > > > > >>
> > > > > > >> > > > > >>
> > > > > > >> > > > > >> --
> > > > > > >> > > > > >> -- Guozhang
> > > > > > >> > > > > >>
> > > > > > >> > > > > >
> > > > > > >> > > > >
> > > > > > >> >
> > > > > > >>
> > > > > > >
> > > >
>


-- 
-- Guozhang
