Hi Sophie & Ismael,

Thank you for your feedback.
No problem, let's pause this KIP and wait for this improvement: KAFKA-12477 <https://issues.apache.org/jira/browse/KAFKA-12477>.
Stay tuned :)

Thank you.
Luke

On Tue, Mar 30, 2021 at 3:14 AM Ismael Juma <ism...@juma.me.uk> wrote:

> Hi Sophie,
>
> I didn't analyze the KIP in detail, but the two suggestions you mentioned
> sound like great improvements.
>
> A bit more context: breaking changes for a widely used product like Kafka
> are costly, which is why we try as hard as we can to avoid them. When it
> comes to the brokers, they are often managed by a central group (or they're
> in the Cloud), so they're a bit easier to manage. Even so, it's still
> possible to upgrade from 0.8.x directly to 2.7 since all protocol versions
> are still supported. When it comes to the basic clients (producer,
> consumer, admin client), they're often embedded in applications so we have
> to be even more conservative.
>
> Ismael
>
> On Mon, Mar 29, 2021 at 10:50 AM Sophie Blee-Goldman
> <sop...@confluent.io.invalid> wrote:
>
> > Ismael,
> >
> > It seems like given 3.0 is a breaking release, we have to rely on users
> > being aware of this and responsible enough to read the upgrade guide.
> > Otherwise we could never ever make any breaking changes beyond just
> > removing deprecated APIs or other compilation-breaking errors that would
> > be immediately visible, no?
> >
> > That said, obviously it's better to have a circuit-breaker that will fail
> > fast in case of a user misconfiguration rather than silently corrupting
> > the consumer group state -- e.g. for two consumers to overlap in their
> > ownership of the same partition(s). We could definitely implement this,
> > and now that I think about it this might solve a related problem in
> > KAFKA-12477 <https://issues.apache.org/jira/browse/KAFKA-12477>. We just
> > add a new field to the Assignment in which the group leader indicates
> > whether it's on a recent enough version to understand cooperative
> > rebalancing.
> > If an upgraded member joins the group, it'll only be allowed to start
> > following the new rebalancing protocol after receiving the go-ahead from
> > the group leader.
> >
> > If we do go ahead and add this new field in the Assignment then I'm pretty
> > confident we can reduce the number of required rolling bounces to just one
> > with KAFKA-12477 <https://issues.apache.org/jira/browse/KAFKA-12477>. In
> > that case we should be in much better shape to feel good about changing
> > the default to the CooperativeStickyAssignor. How does that sound?
> >
> > To be clear, I'm not proposing we do this as part of KIP-726. Here's my
> > take:
> >
> > Let's pause this KIP while I work on making these two improvements in
> > KAFKA-12477 <https://issues.apache.org/jira/browse/KAFKA-12477>. Once I
> > can confirm the short-circuit and single rolling bounce will be available
> > for 3.0, I'll report back on this thread. Then we can move forward with
> > this KIP again.
> >
> > Thoughts?
> > Sophie
> >
> > On Mon, Mar 29, 2021 at 12:01 AM Luke Chen <show...@gmail.com> wrote:
> >
> > > Hi Ismael,
> > > Thanks for your good questions. Answers below:
> > > *1. Are we saying that every consumer upgraded would have to follow the
> > > complex path described in the KIP?*
> > > --> We suggest that every consumer do these 2 steps of rolling upgrade.
> > > And after KAFKA-12477 <https://issues.apache.org/jira/browse/KAFKA-12477>
> > > is completed, it can be reduced to 1 rolling upgrade.
> > >
> > > *2.
> > > what happens if they don't read the instructions and upgrade as they
> > > have in the past?*
> > > --> The reason we want 2 steps of rolling upgrade is that we want to
> > > avoid the situation where the leader is on old byte-code and only
> > > recognizes "eager", but due to compatibility would still be able to
> > > deserialize the new protocol data from newer versioned members, and
> > > hence just goes ahead and does the assignment while new versioned
> > > members did not revoke their partitions before joining the group.
> > >
> > > But I'd say, the new default assignor "CooperativeStickyAssignor" was
> > > already introduced in V2.4.0, and that should be long enough for users
> > > to upgrade to the new byte-code that recognizes the "cooperative"
> > > protocol.
> > >
> > > What do you think?
> > >
> > > Thank you.
> > > Luke
> > >
> > > On Mon, Mar 29, 2021 at 12:14 PM Ismael Juma <ism...@juma.me.uk> wrote:
> > >
> > > > Thanks for the KIP. Are we saying that every consumer upgraded would
> > > > have to follow the complex path described in the KIP? Also, what
> > > > happens if they don't read the instructions and upgrade as they have
> > > > in the past?
> > > >
> > > > Ismael
> > > >
> > > > On Fri, Mar 26, 2021, 1:53 AM Luke Chen <show...@gmail.com> wrote:
> > > >
> > > > > Hi everyone,
> > > > > <Update the subject>
> > > > >
> > > > > I'd like to discuss the following proposal to make the
> > > > > CooperativeStickyAssignor as the default assignor.
> > > > >
> > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-726%3A+Make+the+CooperativeStickyAssignor+as+the+default+assignor
> > > > >
> > > > > Any comments are welcome.
> > > > >
> > > > > Thank you.
> > > > > Luke
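
For readers following the thread, the go-ahead idea Sophie describes (a new field in the Assignment through which the group leader signals that it understands cooperative rebalancing) might look roughly like the sketch below. This is only an illustration under stated assumptions: `RebalanceGate`, `Assignment`, `leaderSupportsCooperative`, and `protocolFor` are hypothetical names, not Kafka classes or APIs, and the sketch assumes that a leader on old byte-code never sets the field, so it reads as `false`.

```java
import java.util.List;

// Hypothetical model of the proposed go-ahead flag; none of these names
// exist in the Kafka codebase.
final class RebalanceGate {
    enum Protocol { EAGER, COOPERATIVE }

    /** Stand-in for an assignment carrying the proposed extra field. */
    static final class Assignment {
        final List<String> partitions;
        // Assumed field: true only when the leader that produced this
        // assignment understands cooperative rebalancing. An old leader
        // would never set it, so it is modeled here as false.
        final boolean leaderSupportsCooperative;

        Assignment(List<String> partitions, boolean leaderSupportsCooperative) {
            this.partitions = partitions;
            this.leaderSupportsCooperative = leaderSupportsCooperative;
        }
    }

    /**
     * An upgraded member follows the cooperative protocol only after the
     * leader has given the go-ahead; otherwise it stays on eager, so it
     * keeps revoking its partitions before rejoining the group.
     */
    static Protocol protocolFor(boolean memberUpgraded, Assignment lastAssignment) {
        if (memberUpgraded && lastAssignment.leaderSupportsCooperative) {
            return Protocol.COOPERATIVE;
        }
        return Protocol.EAGER;
    }

    public static void main(String[] args) {
        Assignment fromOldLeader = new Assignment(List.of("topic-0"), false);
        Assignment fromNewLeader = new Assignment(List.of("topic-0"), true);

        System.out.println("old leader: " + protocolFor(true, fromOldLeader));
        System.out.println("new leader: " + protocolFor(true, fromNewLeader));
    }
}
```

The point of gating on the leader's flag rather than on the member's own version is the failure mode Luke describes: an old leader can still deserialize the newer members' data and hand out an assignment, so the member alone cannot tell whether skipping revocation is safe.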