Hey Ryan, Yes, I believe any open bugs regarding the cooperative-sticky assignor should be considered as blockers to it being made the default, if not blockers to the release in general. I don't think they need to block the acceptance of this KIP, though, just possibly the implementation of it.
On Tue, Jun 8, 2021 at 11:42 AM Konstantine Karantasis <konstant...@confluent.io.invalid> wrote: > Thanks for the KIP Luke. > > Looks good overall. Just a few minor suggestions: > > 1. How about replacing > "Note that this change will also propagate to Connect." with -> "Note that > this change will also automatically be inherited by sink connectors, like > any other application that uses Kafka consumers, as long as a consumer > assignor is not explicitly defined in their configuration." > > The current sentence is a bit general and it would be good to avoid any > confusion with Connect's rebalancing protocol for tasks, which already > supports a single bounce upgrade given that rebalancing protocols in > Connect have a linear lineage until today. > > 2. Let's remove the placeholder text from the Rejected Alternatives and > simply state that there aren't any. Unless something is worth mentioning in > that section. > > Thanks, > Konstantine > > On Tue, Jun 8, 2021 at 5:20 AM Luke Chen <show...@gmail.com> wrote: > > > Hi Ryan, > > Thanks for the comments. The KAFKA-12896 > > <https://issues.apache.org/jira/browse/KAFKA-12896> is already set as > > blocker for V3.0, which means, V3.0 won't be released before KAFKA-12896 > > <https://issues.apache.org/jira/browse/KAFKA-12896> is fixed. I think > this > > KIP and KAFKA-12896 <https://issues.apache.org/jira/browse/KAFKA-12896> > > and > > work in parallel for V3.0 without conflict. > > > > Thank you. > > Luke > > > > On Tue, Jun 8, 2021 at 11:30 AM Ryan Leslie (BLOOMBERG/ 919 3RD A) < > > rles...@bloomberg.net> wrote: > > > > > Hey guys, > > > > > > Should open bugs concerning cooperative-sticky also be considered > > blockers > > > to making it the default? For example, KAFKA-12896 is perhaps still > being > > > investigated: > > > > > > https://issues.apache.org/jira/browse/KAFKA-12896 > > > > > > Thanks, > > > > > > Ryan > > > > > > From: dev@kafka.apache.org At: 06/07/21 19:37:45 UTC-4:00To: > > > dev@kafka.apache.org > > > Subject: Re: [DISCUSS] KIP-726: Make the CooperativeStickyAssignor as > the > > > default assignor > > > > > > Thanks Luke. We may as well get this KIP in to 3.0 so that we can fully > > > enable cooperative rebalancing > > > by default in 3.0 if we have KAFKA-12477 done in time, and if we don't > > then > > > there's no harm as it's > > > not going to change the behavior. > > > > > > On Wed, Jun 2, 2021 at 7:28 PM Luke Chen <show...@gmail.com> wrote: > > > > > > > Hi Sophie, > > > > Thanks for the reminder. Yes, I was thinking this KIP doesn't have to > > be > > > > put into a major release since it will be fully backward compatible, > so > > > no > > > > need to push it. Currently, if we want to work on this KIP, we need > > > > KAFKA-12477 and KAFKA-12487. But you're right, we can at least try > our > > > best > > > > to see if we can make it into V3.0 since cooperative rebalancing is a > > > major > > > > improvement. I'll kick off a vote later. > > > > > > > > Thank you. > > > > Luke > > > > > > > > On Thu, Jun 3, 2021 at 7:08 AM Sophie Blee-Goldman > > > > <sop...@confluent.io.invalid> wrote: > > > > > > > > > Hey Luke, > > > > > > > > > > It's been a while since the last update on this, which is mostly my > > > fault > > > > > for picking up > > > > > other things in the meantime. I'm planning to get back to work > > > > > on KAFKA-12477 next > > > > > week but there are minimal changes to the current implementation > > given > > > > the > > > > > proposal > > > > > I put forth earlier in this KIP discussion, so I think we're good > to > > > go. > > > > > > > > > > Although this KIP no longer requires a major release since it > should > > be > > > > > fully compatible, I > > > > > still hope we can get it in to 3.0 since cooperative rebalancing > is a > > > > major > > > > > improvement to > > > > > the life of a consumer group (and its operator). Can we make sure > the > > > KIP > > > > > reflects the latest > > > > > and then kick off a vote by next Monday at the latest so we can > make > > > KIP > > > > > freeze? > > > > > > > > > > Thanks! > > > > > Sophie > > > > > > > > > > On Fri, Apr 16, 2021 at 2:33 PM Guozhang Wang <wangg...@gmail.com> > > > > wrote: > > > > > > > > > > > 1) From user's perspective, it is always possible that a commit > > > within > > > > > > onPartitionsRevoked throw in practice (e.g. if the member missed > > the > > > > > > previous rebalance where its assigned partitions are already > > > > re-assigned) > > > > > > -- and the onPartitionsLost was introduced for that exact reason, > > > i.e. > > > > it > > > > > > is primarily for optimizations, but not for correctness > guarantees > > -- > > > > on > > > > > > the other hand, it would be surprising to users to see the commit > > > > returns > > > > > > and then later found it not going through. Given that, I'd > suggest > > we > > > > > still > > > > > > throw the exception right away. Regarding the flag itself > though, I > > > > agree > > > > > > that keeping it set until the next succeeded join group makes > sense > > > to > > > > be > > > > > > safer. > > > > > > > > > > > > 2) That's crystal, thank you for the clarification. > > > > > > > > > > > > On Wed, Apr 14, 2021 at 6:46 PM Sophie Blee-Goldman > > > > > > <sop...@confluent.io.invalid> wrote: > > > > > > > > > > > > > 1) Once the short-circuit is triggered, the member will > downgrade > > > to > > > > > the > > > > > > > EAGER protocol, but > > > > > > > won't necessarily try to rejoin the group right away. > > > > > > > > > > > > > > In the "happy path", the user has implemented #onPartitionsLost > > > > > correctly > > > > > > > and will not attempt > > > > > > > to commit partitions that are lost. And since these partitions > > have > > > > > > indeed > > > > > > > been revoked, the user > > > > > > > application should not attempt to commit those partitions after > > > this > > > > > > point. > > > > > > > In this case, there's no > > > > > > > reason for the consumer to immediately rejoin the group. Since > a > > > > > > > non-cooperative assignor was > > > > > > > selected, we know that all partitions have been assigned. This > > > member > > > > > can > > > > > > > continue on as usual, > > > > > > > processing the remaining un-revoked partitions and will follow > > the > > > > > EAGER > > > > > > > protocol in the next > > > > > > > rebalance. There's no user-facing impact or handling required; > > all > > > > that > > > > > > > happens is that the work > > > > > > > since the last commit on those revoked partitions has been > lost. > > > > > > > > > > > > > > In the less-happy path, the user has implemented > > #onPartitionsLost > > > > > > > incorrectly or not implemented > > > > > > > it at all, falling back on the default which invokes > > > > > #onPartitionsRevoked > > > > > > > which in turn will attempt to > > > > > > > commit those partitions during the rebalance callback. In this > > case > > > > we > > > > > > rely > > > > > > > on the flag to prevent > > > > > > > this commit request from being sent to the broker. > > > > > > > > > > > > > > Originally I was thinking we should throw a > CommitFailedException > > > up > > > > > > > through the #onPartitionsLost > > > > > > > callback, and eventually up through poll(), then rejoin the > > group. > > > > But > > > > > > now > > > > > > > I'm wondering if this is really > > > > > > > necessary -- the important point in all cases is just to > prevent > > > the > > > > > > > commit, but there's no reason the > > > > > > > consumer should not be allowed to continue processing its other > > > > > > partitions, > > > > > > > and it hasn't dropped out > > > > > > > of the group. What do you think about this slight amendment to > my > > > > > > original > > > > > > > proposal: if a user does end up > > > > > > > calling commit for whatever reason when invoking > > #onPartitionsLost, > > > > > we'll > > > > > > > just swallow the resulting > > > > > > > CommitFailedException. So the user application wouldn't see > > > anything, > > > > > and > > > > > > > the only impact would be > > > > > > > that these partitions were not able to commit those last set of > > > > offsets > > > > > > on > > > > > > > the revoked partitions. > > > > > > > > > > > > > > WDYT? My only concern there is that the user might have some > > > implicit > > > > > > > assumption that unless a > > > > > > > CommitFailedException was thrown, the offsets of revoked > > partitions > > > > > were > > > > > > > successfully committed > > > > > > > and they may have some downstream logic that should trigger > only > > in > > > > > this > > > > > > > case. If that's a concern, > > > > > > > then I would keep the original proposal which says a > > > > > > CommitFailedException > > > > > > > will be thrown up through > > > > > > > poll(), and leave it up to the user to decide if they want to > > > > trigger a > > > > > > new > > > > > > > rebalance/rejoin the group or not. > > > > > > > > > > > > > > Regarding the flag which prevents committing the revoked > > > partitions, > > > > > this > > > > > > > will need to continue > > > > > > > blocking such commit attempts until the next time the consumer > > > > rejoins > > > > > > the > > > > > > > group, ie until the end > > > > > > > of the next successful rebalance. Technically this shouldn't > > > matter, > > > > > > since > > > > > > > the consumer no longer > > > > > > > owns those partitions this member shouldn't attempt to commit > > them > > > > > > anyways. > > > > > > > Usually we can > > > > > > > rely on the broker rejecting commit attempts on partitions that > > are > > > > not > > > > > > > owned, in which case the > > > > > > > consumer will throw a CommitFailedException. This is similar, > > > except > > > > > that > > > > > > > we can't rely on the > > > > > > > broker having been informed of the change in ownership before > > this > > > > > > consumer > > > > > > > might attempt to > > > > > > > commit. So to avoid this race condition, we'll keep the > > > "blockCommit" > > > > > > flag > > > > > > > until the next rebalance > > > > > > > when we can be certain that the broker is clear on this > > > > > > > partition's ownership. > > > > > > > > > > > > > > 2) I guess maybe the wording here is unclear -- what I meant is > > > that > > > > > all > > > > > > > 3.0 applications will *eventually* > > > > > > > enable cooperative rebalancing in the stable state. This > doesn't > > > mean > > > > > > that > > > > > > > it will select COOPERATIVE > > > > > > > when it first starts up, and in order for this dynamic protocol > > > > upgrade > > > > > > to > > > > > > > be safe we do indeed need to > > > > > > > start off with EAGER and only upgrade once the selected > assignor > > > > > > indicates > > > > > > > that it's safe to do so. > > > > > > > (This only applies if multiple assignors are used, if the > > assignors > > > > are > > > > > > > "cooperative-sticky" only then it > > > > > > > will just start out and forever remain on COOPERATIVE, like in > > > > Streams) > > > > > > > > > > > > > > Since it's just the first rebalance, the choice of COOPERATIVE > vs > > > > EAGER > > > > > > > actually doesn't matter at > > > > > > > all since the consumer won't own any partitions until it's > joined > > > the > > > > > > > group. So we may as well continue > > > > > > > the initial protocol selection strategy of "highest commonly > > > > supported > > > > > > > protocol", but the point is that > > > > > > > 3.0 applications will upgrade to COOPERATIVE as soon as they > have > > > any > > > > > > > partitions. If you can think > > > > > > > of a better way to phrase "New applications on 3.0 will enable > > > > > > cooperative > > > > > > > rebalancing by default" then > > > > > > > please let me know. > > > > > > > > > > > > > > > > > > > > > Thanks for the response -- hope this makes sense so far, but > I'm > > > > happy > > > > > to > > > > > > > elaborate any aspects of the > > > > > > > proposal which aren't clear. I'll also update the ticket > > > description > > > > > > > for KAFKA-12477 with the latest. > > > > > > > > > > > > > > > > > > > > > On Wed, Apr 14, 2021 at 12:03 PM Guozhang Wang < > > wangg...@gmail.com > > > > > > > > > > wrote: > > > > > > > > > > > > > > > Hello Sophie, > > > > > > > > > > > > > > > > Thanks for the detailed explanation, a few clarifying > > questions: > > > > > > > > > > > > > > > > 1) when the short-circuit is triggered, what would happen > next? > > > > Would > > > > > > the > > > > > > > > consumers switch back to EAGER, and try to re-join the group, > > and > > > > > then > > > > > > > upon > > > > > > > > succeeding the next rebalance reset the flag to allow > > committing? > > > > Or > > > > > > > would > > > > > > > > we just fail the consumer immediately. > > > > > > > > > > > > > > > > 2) at the overview you mentioned "New applications on 3.0 > will > > > > enable > > > > > > > > cooperative rebalancing by default", but in the detailed > > > > description > > > > > as > > > > > > > > "With ["cooperative-sticky", "range”], the initial protocol > > will > > > be > > > > > > EAGER > > > > > > > > when the member first joins the group." which seems > > > contradictory? > > > > If > > > > > > we > > > > > > > > want to have cooperative behavior be the default, then with > the > > > > > > > > default ["cooperative-sticky", "range”] the member would > start > > > with > > > > > > > > COOPERATIVE protocol right away. > > > > > > > > > > > > > > > > > > > > > > > > Guozhang > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > On Mon, Apr 12, 2021 at 5:19 AM Chris Egerton > > > > > > > <chr...@confluent.io.invalid > > > > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > Whoops, small correction--meant to say > > > > > > > > > ConsumerRebalanceListener::onPartitionsLost, not > > > > > > > > Consumer::onPartitionsLost > > > > > > > > > > > > > > > > > > On Mon, Apr 12, 2021 at 8:17 AM Chris Egerton < > > > > chr...@confluent.io > > > > > > > > > > > > > > wrote: > > > > > > > > > > > > > > > > > > > Hi Sophie, > > > > > > > > > > > > > > > > > > > > This sounds fantastic. I've made a note on KAFKA-12487 > > about > > > > > being > > > > > > > sure > > > > > > > > > to > > > > > > > > > > implement Consumer::onPartitionsLost to avoid unnecessary > > > task > > > > > > > failures > > > > > > > > > on > > > > > > > > > > consumer protocol downgrade, but besides that, I don't > > think > > > > > things > > > > > > > > could > > > > > > > > > > get any smoother for Connect users or developers. The > > > automatic > > > > > > > > protocol > > > > > > > > > > upgrade/downgrade behavior appears safe, intuitive, and > > > > > pain-free. > > > > > > > > > > > > > > > > > > > > Really excited for this development and hoping we can see > > it > > > > come > > > > > > to > > > > > > > > > > fruition in time for the 3.0 release! > > > > > > > > > > > > > > > > > > > > Cheers, > > > > > > > > > > > > > > > > > > > > Chris > > > > > > > > > > > > > > > > > > > > On Fri, Apr 9, 2021 at 2:43 PM Sophie Blee-Goldman > > > > > > > > > > <sop...@confluent.io.invalid> wrote: > > > > > > > > > > > > > > > > > > > >> 1) Yes, all of the above will be part of KAFKA-12477 > (not > > > > > KIP-726) > > > > > > > > > >> > > > > > > > > > >> 2) No, KAFKA-12638 would be nice to have but I don't > think > > > > it's > > > > > > > > > >> appropriate > > > > > > > > > >> to remove > > > > > > > > > >> the default implementation of #onPartitionsLost in 3.0 > > since > > > > we > > > > > > > never > > > > > > > > > gave > > > > > > > > > >> any indication > > > > > > > > > >> yet that we intend to remove it > > > > > > > > > >> > > > > > > > > > >> 3) Yes, this would be similar to when a Consumer drops > out > > > of > > > > > the > > > > > > > > group. > > > > > > > > > >> It's always been > > > > > > > > > >> possible for a member to miss a rebalance and have its > > > > partition > > > > > > be > > > > > > > > > >> reassigned to another > > > > > > > > > >> member, during which time both members would claim to > own > > > said > > > > > > > > > partition. > > > > > > > > > >> But this is safe > > > > > > > > > >> because the member who dropped out is blocked from > > > committing > > > > > > > offsets > > > > > > > > on > > > > > > > > > >> that partition. > > > > > > > > > >> > > > > > > > > > >> On Fri, Apr 9, 2021 at 2:46 AM Luke Chen < > > show...@gmail.com > > > > > > > > > > wrote: > > > > > > > > > >> > > > > > > > > > >> > Hi Sophie, > > > > > > > > > >> > That sounds great to take care of each case I can > think > > > of. > > > > > > > > > >> > Questions: > > > > > > > > > >> > 1. Do you mean the short-Circuit will also be > > implemented > > > in > > > > > > > > > >> KAFKA-12477? > > > > > > > > > >> > 2. I don't think KAFKA-12638 is the blocker of this > > > KIP-726, > > > > > Am > > > > > > I > > > > > > > > > right? > > > > > > > > > >> > 3. So, does that mean we still have possibility to > have > > > > > multiple > > > > > > > > > >> consumer > > > > > > > > > >> > owned the same topic partition? And in this situation, > > we > > > > > avoid > > > > > > > them > > > > > > > > > >> doing > > > > > > > > > >> > committing, and waiting for next rebalance (should be > > > soon). > > > > > Is > > > > > > my > > > > > > > > > >> > understanding correct? > > > > > > > > > >> > > > > > > > > > > >> > Thank you very much for finding this great solution. > > > > > > > > > >> > > > > > > > > > > >> > Luke > > > > > > > > > >> > > > > > > > > > > >> > On Fri, Apr 9, 2021 at 11:37 AM Sophie Blee-Goldman > > > > > > > > > >> > <sop...@confluent.io.invalid> wrote: > > > > > > > > > >> > > > > > > > > > > >> > > Alright, here's the detailed proposal for > KAFKA-12477. > > > > This > > > > > > > > assumes > > > > > > > > > we > > > > > > > > > >> > will > > > > > > > > > >> > > change the default assignor to > ["cooperative-sticky", > > > > > "range"] > > > > > > > in > > > > > > > > > >> > KIP-726. > > > > > > > > > >> > > It also acknowledges that users may attempt any kind > > of > > > > > > upgrade > > > > > > > > > >> without > > > > > > > > > >> > > reading the docs, and so we need to put in > safeguards > > > > > against > > > > > > > data > > > > > > > > > >> > > corruption rather than assume everyone will follow > the > > > > safe > > > > > > > > upgrade > > > > > > > > > >> path. > > > > > > > > > >> > > > > > > > > > > > >> > > With this proposal, > > > > > > > > > >> > > 1) New applications on 3.0 will enable cooperative > > > > > rebalancing > > > > > > > by > > > > > > > > > >> default > > > > > > > > > >> > > 2) Existing applications which don’t set an assignor > > can > > > > > > safely > > > > > > > > > >> upgrade > > > > > > > > > >> > to > > > > > > > > > >> > > 3.0 using a single rolling bounce with no extra > steps, > > > and > > > > > > will > > > > > > > > > >> > > automatically transition to cooperative rebalancing > > > > > > > > > >> > > 3) Existing applications which do set an assignor > that > > > > uses > > > > > > > EAGER > > > > > > > > > can > > > > > > > > > >> > > likewise upgrade their applications to COOPERATIVE > > with > > > a > > > > > > single > > > > > > > > > >> rolling > > > > > > > > > >> > > bounce > > > > > > > > > >> > > 4) Once on 3.0, applications can safely go back and > > > forth > > > > > > > between > > > > > > > > > >> EAGER > > > > > > > > > >> > and > > > > > > > > > >> > > COOPERATIVE > > > > > > > > > >> > > 5) Applications can safely downgrade away from 3.0 > > > > > > > > > >> > > > > > > > > > > > >> > > The high-level idea for dynamic protocol upgrades is > > > that > > > > > the > > > > > > > > group > > > > > > > > > >> will > > > > > > > > > >> > > leverage the assignor selected by the group > > coordinator > > > to > > > > > > > > determine > > > > > > > > > >> when > > > > > > > > > >> > > it’s safe to upgrade to COOPERATIVE, and trigger a > > > > fail-safe > > > > > > to > > > > > > > > > >> protect > > > > > > > > > >> > the > > > > > > > > > >> > > group in case of rare events or user > misconfiguration. > > > The > > > > > > group > > > > > > > > > >> > > coordinator selects the most preferred assignor > that’s > > > > > > supported > > > > > > > > by > > > > > > > > > >> all > > > > > > > > > >> > > members of the group, so we know that all members > will > > > > > support > > > > > > > > > >> > COOPERATIVE > > > > > > > > > >> > > once we receive the “cooperative-sticky” assignor > > after > > > a > > > > > > > > rebalance. > > > > > > > > > >> At > > > > > > > > > >> > > this point, each member can upgrade their own > protocol > > > to > > > > > > > > > COOPERATIVE. > > > > > > > > > >> > > However, there may be situations in which an EAGER > > > member > > > > > may > > > > > > > join > > > > > > > > > the > > > > > > > > > >> > > group even after upgrading to COOPERATIVE. For > > example, > > > > > > during a > > > > > > > > > >> rolling > > > > > > > > > >> > > upgrade if the last remaining member on the old > > bytecode > > > > > > misses > > > > > > > a > > > > > > > > > >> > > rebalance, the other members will be allowed to > > upgrade > > > to > > > > > > > > > >> COOPERATIVE. > > > > > > > > > >> > If > > > > > > > > > >> > > the old member rejoins and is chosen to be the group > > > > leader > > > > > > > before > > > > > > > > > >> it’s > > > > > > > > > >> > > upgraded to 3.0, it won’t be aware that the other > > > members > > > > of > > > > > > the > > > > > > > > > group > > > > > > > > > >> > have > > > > > > > > > >> > > not yet revoked their partitions when computing the > > > > > > assignment. > > > > > > > > > >> > > > > > > > > > > > >> > > Short Circuit: > > > > > > > > > >> > > The risk of mixing the cooperative and eager > > rebalancing > > > > > > > protocols > > > > > > > > > is > > > > > > > > > >> > that > > > > > > > > > >> > > a partition may be assigned to one member while it > has > > > yet > > > > > to > > > > > > be > > > > > > > > > >> revoked > > > > > > > > > >> > > from its previous owner. The danger is that the new > > > owner > > > > > may > > > > > > > > begin > > > > > > > > > >> > > processing and committing offsets for this partition > > > while > > > > > the > > > > > > > > > >> previous > > > > > > > > > >> > > owner is also committing offsets in its > > > > #onPartitionsRevoked > > > > > > > > > callback, > > > > > > > > > >> > > which is invoked at the end of the rebalance in the > > > > > > cooperative > > > > > > > > > >> protocol. > > > > > > > > > >> > > This can result in these consumers overwriting each > > > > other’s > > > > > > > > offsets > > > > > > > > > >> and > > > > > > > > > >> > > getting a corrupted view of the partition. Note that > > > it’s > > > > > not > > > > > > > > > >> possible to > > > > > > > > > >> > > commit during a rebalance, so we can protect against > > > > offset > > > > > > > > > >> corruption by > > > > > > > > > >> > > blocking further commits after we detect that the > > group > > > > > leader > > > > > > > may > > > > > > > > > not > > > > > > > > > >> > > understand COOPERATIVE, but before we invoke > > > > > > > #onPartitionsRevoked. > > > > > > > > > >> This > > > > > > > > > >> > is > > > > > > > > > >> > > the “short-circuit” — if we detect that the group is > > in > > > an > > > > > > > unsafe > > > > > > > > > >> state, > > > > > > > > > >> > we > > > > > > > > > >> > > invoke #onPartitionsLost instead of > > #onPartitionsRevoked > > > > and > > > > > > > > > >> explicitly > > > > > > > > > >> > > prevent offsets from being committed on those > revoked > > > > > > > partitions. > > > > > > > > > >> > > > > > > > > > > > >> > > Consumer procedure: > > > > > > > > > >> > > Upon startup, the consumer will initially select the > > > > highest > > > > > > > > > >> > > commonly-supported protocol across its configured > > > > assignors. > > > > > > > With > > > > > > > > > >> > > ["cooperative-sticky", "range”], the initial > protocol > > > will > > > > > be > > > > > > > > EAGER > > > > > > > > > >> when > > > > > > > > > >> > > the member first joins the group. Following a > > rebalance, > > > > > each > > > > > > > > member > > > > > > > > > >> will > > > > > > > > > >> > > check the selected assignor. If the chosen assignor > > > > supports > > > > > > > > > >> COOPERATIVE, > > > > > > > > > >> > > the member can upgrade their used protocol to > > > COOPERATIVE > > > > > and > > > > > > no > > > > > > > > > >> further > > > > > > > > > >> > > action is required. If the member is already on > > > > COOPERATIVE > > > > > > but > > > > > > > > the > > > > > > > > > >> > > selected assignor does NOT support it, then we need > to > > > > > trigger > > > > > > > the > > > > > > > > > >> > > short-circuit. In this case we will invoke > > > > #onPartitionsLost > > > > > > > > instead > > > > > > > > > >> of > > > > > > > > > >> > > #onPartitionsRevoked, and set a flag to block any > > > attempts > > > > > at > > > > > > > > > >> committing > > > > > > > > > >> > > those partitions which have been revoked. If a > commit > > is > > > > > > > > attempted, > > > > > > > > > as > > > > > > > > > >> > may > > > > > > > > > >> > > be the case if the user does not implement > > > > #onPartitionsLost > > > > > > > (see > > > > > > > > > >> > > KAFKA-12638), we will throw a CommitFailedException > > > which > > > > > will > > > > > > > be > > > > > > > > > >> bubbled > > > > > > > > > >> > > up through poll() after completing the rebalance. > The > > > > member > > > > > > > will > > > > > > > > > then > > > > > > > > > >> > > downgrade its protocol to EAGER for the next > > rebalance. > > > > > > > > > >> > > > > > > > > > > > >> > > Let me know what you think, > > > > > > > > > >> > > Sophie > > > > > > > > > >> > > > > > > > > > > > >> > > On Fri, Apr 2, 2021 at 7:08 PM Luke Chen < > > > > show...@gmail.com > > > > > > > > > > > > > > wrote: > > > > > > > > > >> > > > > > > > > > > > >> > > > Hi Sophie, > > > > > > > > > >> > > > Making the default to "cooperative-sticky, range" > > is a > > > > > smart > > > > > > > > idea, > > > > > > > > > >> to > > > > > > > > > >> > > > ensure we can at least fall back to rangeAssignor > if > > > > > > consumers > > > > > > > > are > > > > > > > > > >> not > > > > > > > > > >> > > > following our recommended upgrade path. I updated > > the > > > > KIP > > > > > > > > > >> accordingly. > > > > > > > > > >> > > > > > > > > > > > > >> > > > Hi Chris, > > > > > > > > > >> > > > No problem, I updated the KIP to include the > change > > in > > > > > > > Connect. > > > > > > > > > >> > > > > > > > > > > > > >> > > > Thank you very much. > > > > > > > > > >> > > > > > > > > > > > > >> > > > Luke > > > > > > > > > >> > > > > > > > > > > > > >> > > > On Thu, Apr 1, 2021 at 3:24 AM Chris Egerton > > > > > > > > > >> > <chr...@confluent.io.invalid > > > > > > > > > >> > > > > > > > > > > > > >> > > > wrote: > > > > > > > > > >> > > > > > > > > > > > > >> > > > > Hi all, > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > @Sophie - I like the sound of the dual-protocol > > > > default. > > > > > > The > > > > > > > > > >> smooth > > > > > > > > > >> > > > upgrade > > > > > > > > > >> > > > > path it permits sounds fantastic! > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > @Luke - Do you think we can also include Connect > > in > > > > this > > > > > > > KIP? > > > > > > > > > >> Right > > > > > > > > > >> > now > > > > > > > > > >> > > > we > > > > > > > > > >> > > > > don't set any custom partition assignment > > strategies > > > > for > > > > > > the > > > > > > > > > >> consumer > > > > > > > > > >> > > > > groups we bring up for sink tasks, and if we > > > continue > > > > to > > > > > > > just > > > > > > > > > use > > > > > > > > > >> the > > > > > > > > > >> > > > > default, the assignment strategy for those > > consumer > > > > > groups > > > > > > > > would > > > > > > > > > >> > change > > > > > > > > > >> > > > on > > > > > > > > > >> > > > > Connect clusters once people upgrade to 3.0. I > > think > > > > > this > > > > > > is > > > > > > > > > fine > > > > > > > > > >> > > > (assuming > > > > > > > > > >> > > > > we can take care of > > > > > > > > > >> > https://issues.apache.org/jira/browse/KAFKA-12487 > > > > > > > > > >> > > > > before then, which I'm fairly optimistic about), > > but > > > > it > > > > > > > might > > > > > > > > be > > > > > > > > > >> > worth > > > > > > > > > >> > > a > > > > > > > > > >> > > > > sentence or two in the KIP explaining that the > > > change > > > > in > > > > > > > > default > > > > > > > > > >> will > > > > > > > > > >> > > > > intentionally propagate to Connect. And, if we > > think > > > > > > Connect > > > > > > > > > >> should > > > > > > > > > >> > be > > > > > > > > > >> > > > left > > > > > > > > > >> > > > > out of this change and stay on the range > assignor > > > > > instead, > > > > > > > we > > > > > > > > > >> should > > > > > > > > > >> > > > > probably call that fact out in the KIP as well > and > > > > state > > > > > > > that > > > > > > > > > >> Connect > > > > > > > > > >> > > > will > > > > > > > > > >> > > > > now override the default partition assignment > > > strategy > > > > > to > > > > > > be > > > > > > > > the > > > > > > > > > >> > range > > > > > > > > > >> > > > > assignor (assuming the user hasn't specified a > > value > > > > for > > > > > > > > > >> > > > > consumer.partition.assignment.strategy in their > > > worker > > > > > > > config > > > > > > > > or > > > > > > > > > >> for > > > > > > > > > >> > > > > consumer.override.partition.assignment.strategy > in > > > > their > > > > > > > > > connector > > > > > > > > > >> > > > config). > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > Cheers, > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > Chris > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > On Wed, Mar 31, 2021 at 12:18 AM Sophie > > Blee-Goldman > > > > > > > > > >> > > > > <sop...@confluent.io.invalid> wrote: > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > Ok I'm still fleshing out all the details of > > > > > KAFKA-12477 > > > > > > > > but I > > > > > > > > > >> > think > > > > > > > > > >> > > we > > > > > > > > > >> > > > > can > > > > > > > > > >> > > > > > simplify some things a bit, and avoid > > > > > > > > > >> > > > > > any kind of "fail-fast" which will require > user > > > > > > > > intervention. > > > > > > > > > In > > > > > > > > > >> > > fact I > > > > > > > > > >> > > > > > think we can avoid requiring the user to make > > > > > > > > > >> > > > > > any changes at all for KIP-726, so we don't > have > > > to > > > > > > worry > > > > > > > > > about > > > > > > > > > >> > > whether > > > > > > > > > >> > > > > > they actually read our documentation: > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > Instead of making ["cooperative-sticky"] the > > > > default, > > > > > we > > > > > > > > > change > > > > > > > > > >> the > > > > > > > > > >> > > > > default > > > > > > > > > >> > > > > > to ["cooperative-sticky", "range"]. > > > > > > > > > >> > > > > > Since "range" is the old default, this is > > > equivalent > > > > > to > > > > > > > the > > > > > > > > > >> first > > > > > > > > > >> > > > rolling > > > > > > > > > >> > > > > > bounce of the safe upgrade path in KIP-429. > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > Of course this also means that under the > current > > > > > > protocol > > > > > > > > > >> selection > > > > > > > > > >> > > > > > mechanism we won't actually upgrade to > > > > > > > > > >> > > > > > cooperative rebalancing with the default > > assignor. > > > > But > > > > > > > > that's > > > > > > > > > >> where > > > > > > > > > >> > > > > > KAFKA-12477 will come in. > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > @Guozhang Wang <guozh...@confluent.io> I'll > > get > > > > back > > > > > > to > > > > > > > > you > > > > > > > > > >> with > > > > > > > > > >> > a > > > > > > > > > >> > > > > > concrete proposal and answer your questions, I > > > just > > > > > want > > > > > > > to > > > > > > > > > >> point > > > > > > > > > >> > out > > > > > > > > > >> > > > > > that it's possible to side-step the risk of > > users > > > > > > shooting > > > > > > > > > >> > themselves > > > > > > > > > >> > > > in > > > > > > > > > >> > > > > > the foot (well, at least in this one specific > > > case, > > > > > > > > > >> > > > > > obviously they always find a way) > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > On Tue, Mar 30, 2021 at 10:37 AM Guozhang > Wang < > > > > > > > > > >> wangg...@gmail.com > > > > > > > > > >> > > > > > > > > > > > >> > > > > wrote: > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > Hi Sophie, > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > My question is more related to KAFKA-12477, > > but > > > > > since > > > > > > > your > > > > > > > > > >> latest > > > > > > > > > >> > > > > replies > > > > > > > > > >> > > > > > > are on this thread I figured we can > follow-up > > on > > > > the > > > > > > > same > > > > > > > > > >> venue. > > > > > > > > > >> > > Just > > > > > > > > > >> > > > > so > > > > > > > > > >> > > > > > I > > > > > > > > > >> > > > > > > understand your latest comments above about > > the > > > > > > > approach: > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > * I think, we would need to persist this > > > decision > > > > so > > > > > > > that > > > > > > > > > the > > > > > > > > > >> > group > > > > > > > > > >> > > > > would > > > > > > > > > >> > > > > > > never go back to the eager protocol, this > bit > > > > would > > > > > be > > > > > > > > > >> written to > > > > > > > > > >> > > the > > > > > > > > > >> > > > > > > internal topic's assignment message. Is that > > > > > correct? > > > > > > > > > >> > > > > > > * Maybe you can describe the steps, after > the > > > > group > > > > > > has > > > > > > > > > >> decided > > > > > > > > > >> > to > > > > > > > > > >> > > > move > > > > > > > > > >> > > > > > > forward with cooperative protocols, when: > > > > > > > > > >> > > > > > > 1) a new member joined the group with the > old > > > > > version, > > > > > > > and > > > > > > > > > >> hence > > > > > > > > > >> > > only > > > > > > > > > >> > > > > > > recognized eager protocol and executing the > > > eager > > > > > > > protocol > > > > > > > > > >> with > > > > > > > > > >> > its > > > > > > > > > >> > > > > first > > > > > > > > > >> > > > > > > rebalance, what would happen. > > > > > > > > > >> > > > > > > 2) in addition to 1), the new member joined > > the > > > > > group > > > > > > > with > > > > > > > > > the > > > > > > > > > >> > old > > > > > > > > > >> > > > > > version > > > > > > > > > >> > > > > > > and only recognized the old subscription > > format, > > > > and > > > > > > was > > > > > > > > > >> selected > > > > > > > > > >> > > as > > > > > > > > > >> > > > > the > > > > > > > > > >> > > > > > > leader, what would happen. > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > Guozhang > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > On Mon, Mar 29, 2021 at 10:30 PM Luke Chen < > > > > > > > > > show...@gmail.com > > > > > > > > > >> > > > > > > > > > > >> > > > wrote: > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > Hi Sophie & Ismael, > > > > > > > > > >> > > > > > > > Thank you for your feedback. > > > > > > > > > >> > > > > > > > No problem, let's pause this KIP and wait > > for > > > > this > > > > > > > > > >> improvement: > > > > > > > > > >> > > > > > > KAFKA-12477 > > > > > > > > > >> > > > > > > > < > > > > > https://issues.apache.org/jira/browse/KAFKA-12477 > > > > > > >. > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > Stay tuned :) > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > Thank you. > > > > > > > > > >> > > > > > > > Luke > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > On Tue, Mar 30, 2021 at 3:14 AM Ismael > Juma > > < > > > > > > > > > >> ism...@juma.me.uk > > > > > > > > > >> > > > > > > > > > > > >> > > > > wrote: > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > Hi Sophie, > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > I didn't analyze the KIP in detail, but > > the > > > > two > > > > > > > > > >> suggestions > > > > > > > > > >> > you > > > > > > > > > >> > > > > > > mentioned > > > > > > > > > >> > > > > > > > > sound like great improvements. > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > A bit more context: breaking changes > for a > > > > > widely > > > > > > > used > > > > > > > > > >> > product > > > > > > > > > >> > > > like > > > > > > > > > >> > > > > > > Kafka > > > > > > > > > >> > > > > > > > > are costly and hence why we try as hard > as > > > we > > > > > can > > > > > > to > > > > > > > > > avoid > > > > > > > > > >> > > them. > > > > > > > > > >> > > > > When > > > > > > > > > >> > > > > > > it > > > > > > > > > >> > > > > > > > > comes to the brokers, they are often > > managed > > > > by > > > > > a > > > > > > > > > central > > > > > > > > > >> > group > > > > > > > > > >> > > > (or > > > > > > > > > >> > > > > > > > they're > > > > > > > > > >> > > > > > > > > in the Cloud), so they're a bit easier > to > > > > > manage. > > > > > > > Even > > > > > > > > > so, > > > > > > > > > >> > it's > > > > > > > > > >> > > > > still > > > > > > > > > >> > > > > > > > > possible to upgrade from 0.8.x directly > to > > > 2.7 > > > > > > since > > > > > > > > all > > > > > > > > > >> > > protocol > > > > > > > > > >> > > > > > > > versions > > > > > > > > > >> > > > > > > > > are still supported. When it comes to > the > > > > basic > > > > > > > > clients > > > > > > > > > >> > > > (producer, > > > > > > > > > >> > > > > > > > > consumer, admin client), they're often > > > > embedded > > > > > in > > > > > > > > > >> > applications > > > > > > > > > >> > > > so > > > > > > > > > >> > > > > we > > > > > > > > > >> > > > > > > > have > > > > > > > > > >> > > > > > > > > to be even more conservative. > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > Ismael > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > On Mon, Mar 29, 2021 at 10:50 AM Sophie > > > > > > Blee-Goldman > > > > > > > > > >> > > > > > > > > <sop...@confluent.io.invalid> wrote: > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > Ismael, > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > It seems like given 3.0 is a breaking > > > > release, > > > > > > we > > > > > > > > have > > > > > > > > > >> to > > > > > > > > > >> > > rely > > > > > > > > > >> > > > on > > > > > > > > > >> > > > > > > users > > > > > > > > > >> > > > > > > > > > being aware of this and responsible > > > > > > > > > >> > > > > > > > > > enough to read the upgrade guide. > > > Otherwise > > > > we > > > > > > > could > > > > > > > > > >> never > > > > > > > > > >> > > ever > > > > > > > > > >> > > > > > make > > > > > > > > > >> > > > > > > > any > > > > > > > > > >> > > > > > > > > > breaking changes beyond just > > > > > > > > > >> > > > > > > > > > removing deprecated APIs or other > > > > > > > > compilation-breaking > > > > > > > > > >> > errors > > > > > > > > > >> > > > > that > > > > > > > > > >> > > > > > > > would > > > > > > > > > >> > > > > > > > > be > > > > > > > > > >> > > > > > > > > > immediately visible, no? > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > That said, obviously it's better to > > have a > > > > > > > > > >> circuit-breaker > > > > > > > > > >> > > that > > > > > > > > > >> > > > > > will > > > > > > > > > >> > > > > > > > fail > > > > > > > > > >> > > > > > > > > > fast in case of a user > misconfiguration > > > > > > > > > >> > > > > > > > > > rather than silently corrupting the > > > consumer > > > > > > group > > > > > > > > > >> state -- > > > > > > > > > >> > > eg > > > > > > > > > >> > > > > for > > > > > > > > > >> > > > > > > two > > > > > > > > > >> > > > > > > > > > consumers to overlap in their > ownership > > > > > > > > > >> > > > > > > > > > of the same partition(s). We could > > > > definitely > > > > > > > > > implement > > > > > > > > > >> > this, > > > > > > > > > >> > > > and > > > > > > > > > >> > > > > > now > > > > > > > > > >> > > > > > > > > that > > > > > > > > > >> > > > > > > > > > I think about it this might solve a > > > > > > > > > >> > > > > > > > > > related problem in KAFKA-12477 > > > > > > > > > >> > > > > > > > > > < > > > > > > > https://issues.apache.org/jira/browse/KAFKA-12477 > > > > > > > > >. > > > > > > > > > We > > > > > > > > > >> > just > > > > > > > > > >> > > > > add a > > > > > > > > > >> > > > > > > new > > > > > > > > > >> > > > > > > > > > field to the Assignment in which the > > group > > > > > > leader > > > > > > > > > >> > > > > > > > > > indicates whether it's on a recent > > enough > > > > > > version > > > > > > > to > > > > > > > > > >> > > understand > > > > > > > > > >> > > > > > > > > cooperative > > > > > > > > > >> > > > > > > > > > rebalancing. If an upgraded member > > > > > > > > > >> > > > > > > > > > joins the group, it'll only be allowed > > to > > > > > start > > > > > > > > > >> following > > > > > > > > > >> > the > > > > > > > > > >> > > > new > > > > > > > > > >> > > > > > > > > > rebalancing protocol after receiving > the > > > > > > go-ahead > > > > > > > > > >> > > > > > > > > > from the group leader. > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > If we do go ahead and add this new > field > > > in > > > > > the > > > > > > > > > >> Assignment > > > > > > > > > >> > > then > > > > > > > > > >> > > > > I'm > > > > > > > > > >> > > > > > > > > pretty > > > > > > > > > >> > > > > > > > > > confident we can reduce the number > > > > > > > > > >> > > > > > > > > > of required rolling bounces to just > one > > > with > > > > > > > > > KAFKA-12477 > > > > > > > > > >> > > > > > > > > > < > > > > > > > https://issues.apache.org/jira/browse/KAFKA-12477 > > > > > > > > >. > > > > > > > > > In > > > > > > > > > >> > that > > > > > > > > > >> > > > > case > > > > > > > > > >> > > > > > we > > > > > > > > > >> > > > > > > > > > should > > > > > > > > > >> > > > > > > > > > be in much better shape to > > > > > > > > > >> > > > > > > > > > feel good about changing the default > to > > > the > > > > > > > > > >> > > > > > > CooperativeStickyAssignor. > > > > > > > > > >> > > > > > > > > How > > > > > > > > > >> > > > > > > > > > does that sound? > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > To be clear, I'm not proposing we do > > this > > > as > > > > > > part > > > > > > > of > > > > > > > > > >> > KIP-726. > > > > > > > > > >> > > > > > Here's > > > > > > > > > >> > > > > > > my > > > > > > > > > >> > > > > > > > > > take: > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > Let's pause this KIP while I work on > > > making > > > > > > these > > > > > > > > two > > > > > > > > > >> > > > > improvements > > > > > > > > > >> > > > > > in > > > > > > > > > >> > > > > > > > > > KAFKA-12477 < > > > > > > > > > >> > > https://issues.apache.org/jira/browse/KAFKA-12477 > > > > > > > > > >> > > > >. > > > > > > > > > >> > > > > > > Once > > > > > > > > > >> > > > > > > > I > > > > > > > > > >> > > > > > > > > > can > > > > > > > > > >> > > > > > > > > > confirm the > > > > > > > > > >> > > > > > > > > > short-circuit and single rolling > bounce > > > will > > > > > be > > > > > > > > > >> available > > > > > > > > > >> > for > > > > > > > > > >> > > > > 3.0, > > > > > > > > > >> > > > > > > I'll > > > > > > > > > >> > > > > > > > > > report back on this thread. Then we > can > > > move > > > > > > > > > >> > > > > > > > > > forward with this KIP again. > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > Thoughts? > > > > > > > > > >> > > > > > > > > > Sophie > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > On Mon, Mar 29, 2021 at 12:01 AM Luke > > > Chen < > > > > > > > > > >> > > show...@gmail.com> > > > > > > > > > >> > > > > > > wrote: > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > Hi Ismael, > > > > > > > > > >> > > > > > > > > > > Thanks for your good question. > Answer > > > them > > > > > > > below: > > > > > > > > > >> > > > > > > > > > > *1. Are we saying that every > consumer > > > > > upgraded > > > > > > > > would > > > > > > > > > >> have > > > > > > > > > >> > > to > > > > > > > > > >> > > > > > follow > > > > > > > > > >> > > > > > > > the > > > > > > > > > >> > > > > > > > > > > complex path described in the KIP? * > > > > > > > > > >> > > > > > > > > > > --> We suggest that every consumer > did > > > > > these 2 > > > > > > > > steps > > > > > > > > > >> of > > > > > > > > > >> > > > rolling > > > > > > > > > >> > > > > > > > > upgrade. > > > > > > > > > >> > > > > > > > > > > And after KAFKA-12477 < > > > > > > > > > >> > > > > > > > > > > > > > https://issues.apache.org/jira/browse/KAFKA-12477 > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > is completed, it can be reduced to 1 > > > > rolling > > > > > > > > > upgrade. > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > *2. what happens if they don't read > > the > > > > > > > > instructions > > > > > > > > > >> and > > > > > > > > > >> > > > > upgrade > > > > > > > > > >> > > > > > as > > > > > > > > > >> > > > > > > > > they > > > > > > > > > >> > > > > > > > > > > have in the past?* > > > > > > > > > >> > > > > > > > > > > --> The reason we want 2 steps of > > > rolling > > > > > > > upgrade > > > > > > > > is > > > > > > > > > >> that > > > > > > > > > >> > > we > > > > > > > > > >> > > > > want > > > > > > > > > >> > > > > > > to > > > > > > > > > >> > > > > > > > > > avoid > > > > > > > > > >> > > > > > > > > > > the situation where leader is on old > > > > > byte-code > > > > > > > and > > > > > > > > > >> only > > > > > > > > > >> > > > > recognize > > > > > > > > > >> > > > > > > > > > "eager", > > > > > > > > > >> > > > > > > > > > > but due to compatibility would still > > be > > > > able > > > > > > to > > > > > > > > > >> > deserialize > > > > > > > > > >> > > > the > > > > > > > > > >> > > > > > new > > > > > > > > > >> > > > > > > > > > > protocol data from newer versioned > > > > members, > > > > > > and > > > > > > > > > hence > > > > > > > > > >> > just > > > > > > > > > >> > > go > > > > > > > > > >> > > > > > ahead > > > > > > > > > >> > > > > > > > and > > > > > > > > > >> > > > > > > > > > do > > > > > > > > > >> > > > > > > > > > > the assignment while new versioned > > > members > > > > > did > > > > > > > not > > > > > > > > > >> revoke > > > > > > > > > >> > > > their > > > > > > > > > >> > > > > > > > > > partitions > > > > > > > > > >> > > > > > > > > > > before joining the group. > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > But I'd say, the new default > assignor > > > > > > > > > >> > > > > "CooperativeStickyAssignor" > > > > > > > > > >> > > > > > > was > > > > > > > > > >> > > > > > > > > > > already introduced in V2.4.0, and it > > > > should > > > > > be > > > > > > > > long > > > > > > > > > >> > enough > > > > > > > > > >> > > > for > > > > > > > > > >> > > > > > user > > > > > > > > > >> > > > > > > > to > > > > > > > > > >> > > > > > > > > > > upgrade to the new byte-code to > > > recognize > > > > > the > > > > > > > > > >> > "cooperative" > > > > > > > > > >> > > > > > > protocol. > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > What do you think? > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > Thank you. > > > > > > > > > >> > > > > > > > > > > Luke > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > On Mon, Mar 29, 2021 at 12:14 PM > > Ismael > > > > > Juma < > > > > > > > > > >> > > > > ism...@juma.me.uk> > > > > > > > > > >> > > > > > > > > wrote: > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > Thanks for the KIP. Are we saying > > that > > > > > every > > > > > > > > > >> consumer > > > > > > > > > >> > > > > upgraded > > > > > > > > > >> > > > > > > > would > > > > > > > > > >> > > > > > > > > > have > > > > > > > > > >> > > > > > > > > > > > to follow the complex path > described > > > in > > > > > the > > > > > > > KIP? > > > > > > > > > >> Also, > > > > > > > > > >> > > what > > > > > > > > > >> > > > > > > happens > > > > > > > > > >> > > > > > > > > if > > > > > > > > > >> > > > > > > > > > > they > > > > > > > > > >> > > > > > > > > > > > don't read the instructions and > > > upgrade > > > > as > > > > > > > they > > > > > > > > > >> have in > > > > > > > > > >> > > the > > > > > > > > > >> > > > > > past? > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > Ismael > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > On Fri, Mar 26, 2021, 1:53 AM Luke > > > Chen > > > > < > > > > > > > > > >> > > show...@gmail.com > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > wrote: > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > Hi everyone, > > > > > > > > > >> > > > > > > > > > > > > <Update the subject> > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > I'd like to discuss the > following > > > > > proposal > > > > > > > to > > > > > > > > > make > > > > > > > > > >> > the > > > > > > > > > >> > > > > > > > > > > > > CooperativeStickyAssignor as the > > > > default > > > > > > > > > assignor. > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-726%3A+Make+the+Cooperativ > > > eStickyAssignor+as+the+default+assignor > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > Any comments are welcomed. > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > Thank you. > > > > > > > > > >> > > > > > > > > > > > > Luke > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > -- > > > > > > > > > >> > > > > > > -- Guozhang > > > > > > > > > >> > > > > > > > > > > > > > > > >> > > > > > > > > > > > > > > >> > > > > > > > > > > > > > >> > > > > > > > > > > > > >> > > > > > > > > > > > >> > > > > > > > > > > >> > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > > > -- Guozhang > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > > > -- Guozhang > > > > > > > > > > > > > > > > > > > > > > > > > > >