Thanks for the KIP Damian! My two cents:

- we should have an explicit parameter for this -- implicit settings are
  always tricky (the "importance" of this parameter would be LOW)
- the config should be different for each consumer group:
  * assume you have a stateless app: you want to rebalance immediately
  * if you start up in a virtualized environment using some tool like
    Mesos, you might need a different value than on bare metal (no VM to
    be started)
  * it also depends on how many consumer instances you expect -- it's
    harder to start up 100 instances in 3 seconds than 5
- the default value should be zero

One more thought: what about scaling scenarios? If a consumer group has 10
instances and should be scaled up to 20, it would make sense to do this
with a single rebalance, too. Thus, I am wondering whether it would make
sense to apply this delay each time a new consumer joins the group, even
if the group is not empty?

-Matthias

On 3/23/17 10:19 AM, Damian Guy wrote:
> Thanks Guozhang - i think another problem with this is that it is
> overloading session.timeout.ms to mean multiple things. I'm not sure that
> is a good thing.
>
> On Thu, 23 Mar 2017 at 17:14 Guozhang Wang <wangg...@gmail.com> wrote:
>
>> The downside of it, though, is that although it "hides" this from most of
>> the users needing to be aware of it, by default session timeout i.e. the
>> rebalance timeout is 10 seconds which could arguably be too long.
>>
>>
>> Guozhang
>>
>> On Thu, Mar 23, 2017 at 10:12 AM, Guozhang Wang <wangg...@gmail.com>
>> wrote:
>>
>>> Just throwing another alternative idea here: we can consider using the
>>> rebalance timeout value which is already included in the join request
>>> protocol (and on the current Java client it is always written as the
>>> session timeout value), that the first member joining will always force
>>> the coordinator to wait that long. By doing this we do not need to bump
>>> up the protocol either.
>>>
>>>
>>> Guozhang
>>>
>>> On Thu, Mar 23, 2017 at 5:49 AM, Damian Guy <damian....@gmail.com> wrote:
>>>
>>>> Hi Ismael,
>>>>
>>>> Mostly to avoid the protocol bump.
>>>>
>>>> I agree that it may be difficult to choose the right delay for all
>>>> consumer groups, but we wanted to make this something that most users
>>>> don't really need to think about, i.e., a small enough default delay
>>>> that works in the majority of cases. However it would be much more
>>>> flexible as a consumer config, which i'm happy to pursue if this change
>>>> is worthy of a protocol bump.
>>>>
>>>> Thanks,
>>>> Damian
>>>>
>>>> On Thu, 23 Mar 2017 at 12:35 Ismael Juma <ism...@juma.me.uk> wrote:
>>>>
>>>>> Thanks for the KIP, Damian. It makes sense to avoid multiple
>>>>> rebalances during start-up. One issue with having this as a broker
>>>>> config is that it may be difficult to choose the right delay for all
>>>>> consumer groups. Can you elaborate a little more on why the first
>>>>> alternative (add a consumer config) was rejected? We bump protocol
>>>>> versions regularly (when it makes sense), so it would be good to get a
>>>>> bit more detail.
>>>>>
>>>>> Thanks,
>>>>> Ismael
>>>>>
>>>>> On Thu, Mar 23, 2017 at 12:24 PM, Damian Guy <damian....@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> I've prepared a KIP to add a configurable delay to the initial
>>>>>> consumer group rebalance.
>>>>>>
>>>>>> Please have a look here:
>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-134%3A+Delay+initial+consumer+group+rebalance
>>>>>>
>>>>>> Thanks,
>>>>>> Damian
>>>>>>
>>>>>> BTW, i apologize if this appears twice. Seems the first one may have
>>>>>> not made it.
>>>
>>> --
>>> -- Guozhang
>>
>> --
>> -- Guozhang
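
To make the trade-off in this thread concrete, here is a rough sketch of how a rebalance delay could fold several joins into one rebalance, both at start-up and in the scale-up case. It is only a toy model under stated assumptions: the function name `count_rebalances`, the fixed (non-extending) window, and the join times are all hypothetical, and this is not how the group coordinator is implemented.

```python
def count_rebalances(join_times, delay):
    """Count rebalances for members joining at `join_times` (seconds).

    Simplified model (an assumption, not Kafka's coordinator logic):
    a join that arrives while no window is open opens a window of
    `delay` seconds; every join inside that window is folded into the
    single rebalance that fires when the window closes.
    """
    rebalances = 0
    window_end = None  # time at which the currently open window closes
    for t in sorted(join_times):
        if window_end is None or t > window_end:
            # No open window covers this join: it triggers a new rebalance.
            rebalances += 1
            window_end = t + delay
    return rebalances


# Hypothetical start-up: four instances come up within 2.5 seconds.
startup = [0.0, 0.5, 1.0, 2.5]
print(count_rebalances(startup, delay=0))  # 4: one rebalance per join
print(count_rebalances(startup, delay=3))  # 1: all joins share one window

# Hypothetical scale-up: two more instances join a running group later.
scale_up = startup[:3] + [10.0, 11.0]
print(count_rebalances(scale_up, delay=3))  # 2: one at start-up, one at scale-up
```

With a delay of zero every join rebalances immediately, which is what a stateless app wants; a larger delay trades start-up latency for fewer rebalances, and applying the window to every new join covers the scale-up case as well.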