Thanks for the answers and updates Sean!

On Mon, Jan 26, 2026 at 7:43 AM Lucas Brutschy via dev <[email protected]> wrote:
> Hi Sean,
>
> that makes a lot of sense, thanks for the explanation!
>
> Cheers,
> Lucas
>
> On Sat, Jan 24, 2026 at 11:17 AM Sean Quah via dev <[email protected]> wrote:
> >
> > Hi Lucas,
> >
> > > LB01: I'm just wondering if it would have been an option to instead
> > > update the target assignment to remove the partitions from the member
> > > immediately when the member unsubscribes?
> >
> > Good question. I can't recall my thought process when implementing
> > KAFKA-19431. Three points spring to mind with updating the target
> > assignment on subscription changes:
> > 1. I think I wanted to preserve the property that the target assignment
> > is only updated by assignment runs and immutable for a given epoch.
> > Though it turns out that's not actually the case, since we do patch the
> > target assignment when members leave or static members are replaced.
> > 2. An offloaded assignment based on older subscriptions can complete
> > right after we patch the target assignment to remove unsubscribed
> > topics, so we would need to do some extra filtering on assignor
> > completion.
> > 3. Even if we update the target assignment, we would need to touch the
> > reconciliation process anyway, since it wouldn't do anything when there
> > is no epoch bump.
> >
> > There is certainly nothing stopping us from updating the target
> > assignment. I think it seemed cleaner at the time to keep it all in the
> > reconciliation process.
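> >
> > To sketch the reconciliation-side change (illustrative names only, not
> > the actual coordinator code): partitions of unsubscribed topics are
> > simply dropped from the member's assignable set, so they end up revoked
> > even while the target assignment still lags behind the subscription.
> >
> >     // Hypothetical sketch; Uuid is org.apache.kafka.common.Uuid.
> >     Map<Uuid, Set<Integer>> filterBySubscription(
> >             Map<Uuid, Set<Integer>> targetAssignment,
> >             Set<Uuid> subscribedTopicIds) {
> >         Map<Uuid, Set<Integer>> filtered = new HashMap<>();
> >         targetAssignment.forEach((topicId, partitions) -> {
> >             // Topics the member no longer subscribes to are omitted,
> >             // so their partitions get revoked during reconciliation.
> >             if (subscribedTopicIds.contains(topicId)) {
> >                 filtered.put(topicId, partitions);
> >             }
> >         });
> >         return filtered;
> >     }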
> >
> > Thanks,
> > Sean
> >
> > On Fri, Jan 23, 2026 at 12:32 PM Lucas Brutschy <[email protected]>
> > wrote:
> > >
> > > Hey Sean,
> > >
> > > thanks for the KIP! This makes a lot of sense to me. I don't really
> > > have anything I want you to change about the KIP.
> > >
> > > > We modify reconciliation to revoke any partitions the member is no
> > > > longer subscribed to, since the target assignment may lag behind
> > > > member subscriptions.
> > >
> > > LB01: I'm just wondering if it would have been an option to instead
> > > update the target assignment to remove the partitions from the member
> > > immediately when the member unsubscribes?
> > >
> > > Cheers,
> > > Lucas
> > >
> > > On Thu, Jan 22, 2026 at 11:58 AM Sean Quah via dev <[email protected]>
> > > wrote:
> > > >
> > > > > LM1: About group.initial.rebalance.delay.ms, I expect the
> > > > > interaction with the interval is just as described for the streams
> > > > > initial delay and interval, correct? Should we clarify that in the
> > > > > KIP (it only mentions the streams case)?
> > > >
> > > > We haven't added a consumer or share group initial.rebalance.delay.ms
> > > > config yet. It only exists for streams right now.
> > > >
> > > > > LM2: The KIP refers to batching assignment re-calculations
> > > > > triggered by member subscription changes, but I expect the batching
> > > > > mechanism applies the same when the assignment re-calculation is
> > > > > triggered by metadata changes (i.e. topic/partition created or
> > > > > deleted), without any HB changing subscriptions. Is my
> > > > > understanding correct?
> > > >
> > > > Yes, that's right. Topic metadata changes also bump the group epoch
> > > > and trigger the same assignment flow.
> > > >
> > > > > LM3: About this section: "When there is an in-flight assignor run
> > > > > for the group, there is no new target assignment. We will trigger
> > > > > the next assignor run on a future heartbeat." I expect that the
> > > > > next assignor run will be triggered on the next HB from this or
> > > > > from any other member of the group, received after the interval
> > > > > expires (without the members re-sending the subscription change).
> > > > > Is my expectation correct? If so, it may be worth clarifying in
> > > > > the KIP to avoid confusion with client-side implementations.
> > > >
> > > > I tried to clarify in the KIP. Let me know your thoughts!
> > > >
> > > > On Thu, Jan 22, 2026 at 10:56 AM Sean Quah <[email protected]> wrote:
> > > >
> > > > >> dl01: Could we mention the handling when the group metadata or
> > > > >> topic partition metadata is changed or deleted during the async
> > > > >> assignor run?
> > > > >
> > > > > Thanks! I've added a paragraph to the Assignment Offload section
> > > > > describing the handling of group metadata changes. Topic metadata
> > > > > changes already bump the group epoch and we don't need to handle
> > > > > them specially.
> > > > >
> > > > >> dl02: This might be a question for the overall coordinator
> > > > >> executor - do we have plans to apply an explicit size limit to the
> > > > >> executor queue? If many groups trigger offloaded assignments
> > > > >> simultaneously, should we apply some backpressure for protection?
> > > > >
> > > > > There aren't any plans for that right now. We actually don't have a
> > > > > size limit for the event processor queue either.
> > > > >
> > > > > On Thu, Jan 22, 2026 at 10:56 AM Sean Quah <[email protected]> wrote:
> > > > >
> > > > >> Hi all, thanks for the feedback so far.
> > > > >>
> > > > >>> dj01: In the proposed changes section, you state that the
> > > > >>> timestamp of the last assignment is not persisted. How do you
> > > > >>> plan to bookkeep it if it is not stored with the assignment?
> > > > >>> Intuitively, I would add a timestamp to the assignment record.
> > > > >>
> > > > >> Thinking about it, it's easier to add it to the assignment record.
> > > > >> I will update the KIP. One thing to note is that the timestamp
> > > > >> will be subject to rollbacks when writing to the log fails, so we
> > > > >> can allow extra assignment runs when that happens.
> > > > >>
> > > > >>> dj02: I wonder whether we should also add a "thread idle ratio"
> > > > >>> metric for the group coordinator executor. What do you think?
> > > > >>
> > > > >> I think it could be useful so I've added it to the KIP. The
> > > > >> implementation will have to be different from the event processor,
> > > > >> since we currently use an ExecutorService.
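> > > > >>
> > > > >> As a rough sketch of what I have in mind (illustrative only, not
> > > > >> the final implementation), we could wrap each submitted task to
> > > > >> accumulate busy time and report 1 - busy / (threads * elapsed):
> > > > >>
> > > > >>     import java.util.concurrent.ExecutorService;
> > > > >>     import java.util.concurrent.atomic.AtomicLong;
> > > > >>
> > > > >>     class IdleRatioExecutor {
> > > > >>         private final ExecutorService delegate;
> > > > >>         private final int numThreads;
> > > > >>         private final AtomicLong busyNanos = new AtomicLong();
> > > > >>         private final long startNanos = System.nanoTime();
> > > > >>
> > > > >>         IdleRatioExecutor(ExecutorService delegate, int numThreads) {
> > > > >>             this.delegate = delegate;
> > > > >>             this.numThreads = numThreads;
> > > > >>         }
> > > > >>
> > > > >>         void submit(Runnable task) {
> > > > >>             delegate.execute(() -> {
> > > > >>                 long begin = System.nanoTime();
> > > > >>                 try {
> > > > >>                     task.run();
> > > > >>                 } finally {
> > > > >>                     // Accumulate time spent running tasks.
> > > > >>                     busyNanos.addAndGet(System.nanoTime() - begin);
> > > > >>                 }
> > > > >>             });
> > > > >>         }
> > > > >>
> > > > >>         // Fraction of total thread capacity spent idle so far.
> > > > >>         double idleRatio() {
> > > > >>             long capacity = (System.nanoTime() - startNanos) * numThreads;
> > > > >>             return 1.0 - (double) busyNanos.get() / capacity;
> > > > >>         }
> > > > >>     }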
> > > > >>
> > > > >>> dj03: If the executor is not used by the share coordinator, it
> > > > >>> should not expose any metrics about it. Is it possible to remove
> > > > >>> them?
> > > > >>
> > > > >> I've removed them from the KIP. We can add a parameter to the
> > > > >> coordinator metrics class to control whether they are visible.
> > > > >>
> > > > >>> dj04: Is having one group coordinator executor thread sufficient
> > > > >>> by default for common workloads?
> > > > >>
> > > > >> Yes and no. I expect it will be very difficult to overload an
> > > > >> entire thread, i.e. submit work faster than it can complete it.
> > > > >> But updating the default to two threads could be good for reducing
> > > > >> delays due to simultaneous assignor runs. I've raised the default
> > > > >> to 2 threads.
> > > > >>
> > > > >>> dj05: It seems you propose enabling the minimum assignor interval
> > > > >>> with a default of 5 seconds. However, the offloading is not
> > > > >>> enabled by default. Is the first one enough to guarantee the
> > > > >>> stability of the group coordinator? How do you foresee enabling
> > > > >>> the second one in the future? It would be great if you could
> > > > >>> address this in the KIP. We need a clear motivation for changing
> > > > >>> the default behavior and a plan for the future.
> > > > >>
> > > > >> I initially thought that offloading would increase rebalance times
> > > > >> by 1 heartbeat and so didn't propose turning it on by default. But
> > > > >> after some more thinking, I believe both features will increase
> > > > >> rebalance times by 1 heartbeat interval and the increase shouldn't
> > > > >> stack. The minimum assignor interval only impacts groups with more
> > > > >> than 2 members, while offloading only impacts groups with a single
> > > > >> member. This is because in the other cases, the extra delays are
> > > > >> folded into existing revocation + heartbeat delays. Note that
> > > > >> share groups have no revocation so always see increased rebalance
> > > > >> times. I've updated the KIP to add the analysis of rebalance times
> > > > >> and propose turning both features on by default.
> > > > >>
> > > > >>> dj06: Based on its description, I wonder whether
> > > > >>> `consumer.min.assignor.interval.ms` should be called
> > > > >>> `consumer.min.assignment.interval.ms`. What do you think?
> > > > >>
> > > > >> Thanks, I've renamed the config options in the KIP. What about the
> > > > >> assignor.offload.enable configs?
> > > > >>
> > > > >>> dj07: It is not possible to enable/disable the offloading at the
> > > > >>> group level. This makes sense to me but it would be great to
> > > > >>> explain the rationale for it in the KIP.
> > > > >>
> > > > >> Thinking about it, there's nothing stopping us from configuring
> > > > >> offloading at the group level. In fact, it might be desirable for
> > > > >> some users to disable offloading at the group coordinator level to
> > > > >> keep rebalances fast and only enable it for problematic large
> > > > >> groups. I've added a group-level override to the KIP.
> > > > >>
> > > > >> On Tue, Jan 20, 2026 at 1:38 PM Lianet Magrans <[email protected]>
> > > > >> wrote:
> > > > >>
> > > > >>> Hi Sean, thanks for the KIP.
> > > > >>>
> > > > >>> LM1: About group.initial.rebalance.delay.ms, I expect the
> > > > >>> interaction with the interval is just as described for the
> > > > >>> streams initial delay and interval, correct? Should we clarify
> > > > >>> that in the KIP (it only mentions the streams case)?
> > > > >>>
> > > > >>> LM2: The KIP refers to batching assignment re-calculations
> > > > >>> triggered by member subscription changes, but I expect the
> > > > >>> batching mechanism applies the same when the assignment
> > > > >>> re-calculation is triggered by metadata changes (i.e.
> > > > >>> topic/partition created or deleted), without any HB changing
> > > > >>> subscriptions. Is my understanding correct?
> > > > >>>
> > > > >>> LM3: About this section: "*When there is an in-flight assignor
> > > > >>> run for the group, there is no new target assignment. We will
> > > > >>> trigger the next assignor run on a future heartbeat.*" I expect
> > > > >>> that the next assignor run will be triggered on the next HB from
> > > > >>> this or from any other member of the group, received after the
> > > > >>> interval expires (without the members re-sending the subscription
> > > > >>> change). Is my expectation correct? If so, it may be worth
> > > > >>> clarifying in the KIP to avoid confusion with client-side
> > > > >>> implementations.
> > > > >>>
> > > > >>> Thanks!
> > > > >>> Lianet
> > > > >>>
> > > > >>> On Tue, Jan 13, 2026 at 1:23 AM Sean Quah via dev <[email protected]>
> > > > >>> wrote:
> > > > >>>
> > > > >>>> sq01: We also have to update the SyncGroup request handling to
> > > > >>>> only return REBALANCE_IN_PROGRESS when the member's epoch is
> > > > >>>> behind the target assignment epoch, not the group epoch. Thanks
> > > > >>>> to Dongnuo for pointing this out.
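> > > > >>>>
> > > > >>>> In other words, roughly (illustrative accessor names, not the
> > > > >>>> exact code path):
> > > > >>>>
> > > > >>>>     // Before: member.memberEpoch() < group.groupEpoch()
> > > > >>>>     if (member.memberEpoch() < group.targetAssignmentEpoch()) {
> > > > >>>>         return Errors.REBALANCE_IN_PROGRESS;
> > > > >>>>     }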
> > > > >>>>
> > > > >>>> On Thu, Jan 8, 2026 at 5:40 PM Dongnuo Lyu via dev <[email protected]>
> > > > >>>> wrote:
> > > > >>>>
> > > > >>>> > Hi Sean, thanks for the KIP! I have a few questions as follows.
> > > > >>>> >
> > > > >>>> > dl01: Could we mention the handling when the group metadata or
> > > > >>>> > topic partition metadata is changed or deleted during the
> > > > >>>> > async assignor run?
> > > > >>>> >
> > > > >>>> > dl02: This might be a question for the overall coordinator
> > > > >>>> > executor - do we have plans to apply an explicit size limit to
> > > > >>>> > the executor queue? If many groups trigger offloaded
> > > > >>>> > assignments simultaneously, should we apply some backpressure
> > > > >>>> > for protection?
> > > > >>>> >
> > > > >>>> > Also, resonating with dj05: defaulting `min.assignor.interval.ms`
> > > > >>>> > to 5s might not be necessary for small groups, so I'm not sure
> > > > >>>> > we want to make batched assignment the default. Or it might be
> > > > >>>> > good to have per-group enablement.
> > > > >>>> >
> > > > >>>> > Thanks
> > > > >>>> > Dongnuo
