@James What you described is true: the transition from dynamic to static membership has not been thought through yet. But I do not think it is an impossible problem: note that we did indeed move the offset commit from ZK to the Kafka coordinator in 0.8.2 :) The migration plan there was to first do double commits to both ZK and the coordinator, and then do a second round to turn the ZK commits off.
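To make that two-round pattern concrete, here is a toy model of it. It is only a sketch under my own assumptions: the `Phase` enum, the `OffsetCommitter` class, and the two dicts standing in for ZK and the coordinator are hypothetical names for illustration, not the actual consumer code.

```python
from enum import Enum


class Phase(Enum):
    ZK_ONLY = 0           # before the migration starts
    DOUBLE_COMMIT = 1     # after the first rolling bounce
    COORDINATOR_ONLY = 2  # after the second rolling bounce


class OffsetCommitter:
    """Toy committer; plain dicts stand in for ZK and the coordinator."""

    def __init__(self, phase, zk_store, coordinator_store):
        self.phase = phase
        self.zk = zk_store
        self.coordinator = coordinator_store

    def commit(self, partition, offset):
        # Round one keeps writing to ZK so instances that have not been
        # bounced yet still see up-to-date offsets there.
        if self.phase in (Phase.ZK_ONLY, Phase.DOUBLE_COMMIT):
            self.zk[partition] = offset
        # From round one onwards the coordinator also gets every commit,
        # so it is fully populated before ZK commits are switched off.
        if self.phase in (Phase.DOUBLE_COMMIT, Phase.COORDINATOR_ONLY):
            self.coordinator[partition] = offset
```

The point of the intermediate phase is that old and new instances can coexist during the bounce without either side losing offsets.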
So just to throw a wild idea here: also following a two-rolling-bounce manner, in the JoinGroupRequest we can set the flag to "static" while still keeping the registry-id field empty. In this case, the coordinator still follows the "dynamic" logic, accepting the request while allowing the protocol to be set to "static". After the first rolling bounce, the group protocol is already "static"; then a second rolling bounce is triggered, and this time we set the registry-id.

Guozhang

On Tue, Aug 7, 2018 at 1:19 AM, James Cheng <wushuja...@gmail.com> wrote:

> Guozhang, in a previous message, you proposed said this:
>
> > On Jul 30, 2018, at 3:56 PM, Guozhang Wang <wangg...@gmail.com> wrote:
> >
> > 1. We bump up the JoinGroupRequest with additional fields:
> >
> > 1.a) a flag indicating "static" or "dynamic" membership protocols.
> > 1.b) with "static" membership, we also add the pre-defined member id.
> > 1.c) with "static" membership, we also add an optional
> > "group-change-timeout" value.
> >
> > 2. On the broker side, we enforce only one of the two protocols for all
> > group members: we accept the protocol on the first joined member of the
> > group, and if later joining members indicate a different membership
> > protocol, we reject it. If the group-change-timeout value was different to
> > the first joined member, we reject it as well.
>
> What will happen if we have an already-deployed application that wants to
> switch to using static membership? Let’s say there are 10 instances of it.
> As the instances go through a rolling restart, they will switch from
> dynamic membership (the default?) to static membership. As each one leaves
> the group and restarts, they will be rejected from the group (because the
> group is currently using dynamic membership). The group will shrink down
> until there is 1 node handling all the traffic. After that one restarts,
> the group will switch over to static membership.
>
> Is that right? That means that the transition plan from dynamic to static
> membership isn’t very smooth.
>
> I’m not really sure what can be done in this case. This reminds me of the
> transition plans that were discussed for moving from zookeeper-based
> consumers to kafka-coordinator-based consumers. That was also hard, and
> ultimately we decided not to build that.
>
> -James
>

--
-- Guozhang
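
P.S. A rough sketch of how the coordinator side of the two-rolling-bounce idea might look, in case it helps the discussion. Everything here is hypothetical: the `JoinGroupRequest` and `Group` classes, their field names, and the generated id format are made up for illustration, and how to treat a "dynamic" request that arrives once the group is already "static" is left open (this sketch simply keeps handling it dynamically).

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class JoinGroupRequest:
    protocol: str                      # "static" or "dynamic" flag
    registry_id: Optional[str] = None  # left empty during the first bounce


@dataclass
class Group:
    protocol: str = "dynamic"
    members: Set[str] = field(default_factory=set)
    next_id: int = 0

    def join(self, req: JoinGroupRequest) -> str:
        # Accept the "static" flag even without a registry-id, so the
        # group protocol can flip during the first rolling bounce.
        if req.protocol == "static":
            self.protocol = "static"

        if req.protocol == "static" and req.registry_id:
            # Second bounce: the member supplies its pre-defined id.
            member_id = req.registry_id
        else:
            # No registry-id yet: fall back to the "dynamic" logic and
            # let the coordinator generate an id (assumed format).
            self.next_id += 1
            member_id = f"generated-{self.next_id}"

        self.members.add(member_id)
        return member_id
```

With this shape, the first bounce sends `JoinGroupRequest("static")` with no id, and the second bounce sends `JoinGroupRequest("static", "instance-1")` and so on, so no member is rejected at either step.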