I have the same initial response as Ismael re: broker vs consumer settings.
The global setting seems questionable.

Could we maybe summarize what the impact of making this a client config
would be? Protocol bump is obvious, but is there any other significant
issue? For the protocol bump in particular, I think this change is
currently really critical for streams; it will be valuable elsewhere, but
the immediate demand is streams, so a protocol bump while being backwards
compatible wouldn't affect any other clients. Is this still actually
compatible with different clients given that they would now expect
different timeouts? (I think it's strictly compatible if you wait for
responses, but if you enforce any client side timeouts, I'm not so sure.)

re: test plan, I'm sure this will come as a surprise, but is the system
test even necessary? Validating # of rebalances seems messy as other things
can cause rebalances (though admittedly not in a "clean" case). But really
it seems like an integration test could validate this by making sure only 1
rebalance occurred when 2 members joined with a sufficient time gap.

-Ewen

On Thu, Mar 23, 2017 at 3:53 PM, Matthias J. Sax <matth...@confluent.io>
wrote:

> Thanks for the KIP Damian!
>
> My two cents:
>
>  - we should have an explicit parameter for this -- implicit setting are
> always tricky (the "importance" of this parameter would be LOW)
>
>  - the config should be different for each consumer group:
>    * assume you have a stateless app, you want to rebalance immediately
>    * if you start-up in an visualized environment using some tools like
> Mesos you might need a different value that on bare metal (no VM to be
> started)
>    * it also depends, how many consumer instanced you expect -- it's
> harder to start up 100 instances in 3 seconds than 5
>
>  - the default value should be zero
>
>
> One more thought: what about scaling scenarios? If a consumer group has
> 10 instanced and should be scaled up to 20, it would make sense to do
> this with a single rebalance, too. Thus, I am wondering, if it would
> make sense to apply this delay each time a new consumer joins group,
> even if the group is not empty?
>
>
> -Matthias
>
>
> On 3/23/17 10:19 AM, Damian Guy wrote:
> > Thanks Gouzhang - i think another problem with this is that is
> overloading
> > session.timeout.ms to mean multiple things. I'm not sure that is a good
> > thing.
> >
> > On Thu, 23 Mar 2017 at 17:14 Guozhang Wang <wangg...@gmail.com> wrote:
> >
> >> The downside of it, though, is that although it "hides" this from most
> of
> >> the users needing to be aware of it, by default session timeout i.e. the
> >> rebalance timeout is 10 seconds which could arguably too long.
> >>
> >>
> >> Guozhang
> >>
> >> On Thu, Mar 23, 2017 at 10:12 AM, Guozhang Wang <wangg...@gmail.com>
> >> wrote:
> >>
> >>> Just throwing another alternative idea here: we can consider using the
> >>> rebalance timeout value which is already included in the join request
> >>> protocol (and on the current Java client it is always written as the
> >>> session timeout value), that the first member joining will always force
> >> the
> >>> coordinator to wait that long. By doing this we do not need to bump up
> >> the
> >>> protocol either.
> >>>
> >>>
> >>> Guozhang
> >>>
> >>> On Thu, Mar 23, 2017 at 5:49 AM, Damian Guy <damian....@gmail.com>
> >> wrote:
> >>>
> >>>> Hi Ismael,
> >>>>
> >>>> Mostly to avoid the protocol bump.
> >>>>
> >>>> I agree that it may be difficult to choose the right delay for all
> >>>> consumer
> >>>> groups, but we wanted to make this something that most users don't
> >> really
> >>>> need to think about, i.e., a small enough default delay that works in
> >> the
> >>>> majority of cases. However it would be much more flexible as a
> consumer
> >>>> config, which i'm happy to pursue if this change is worthy of a
> protocol
> >>>> bump.
> >>>>
> >>>> Thanks,
> >>>> Damian
> >>>>
> >>>> On Thu, 23 Mar 2017 at 12:35 Ismael Juma <ism...@juma.me.uk> wrote:
> >>>>
> >>>>> Thanks for the KIP, Damian. It makes sense to avoid multiple
> >> rebalances
> >>>>> during start-up. One issue with having this as a broker config is
> that
> >>>> it
> >>>>> may be difficult to choose the right delay for all consumer groups.
> >> Can
> >>>> you
> >>>>> elaborate a little more on why the first alternative (add a consumer
> >>>>> config) was rejected? We bump protocol versions regularly (when it
> >> makes
> >>>>> sense), so it would be good to get a bit more detail.
> >>>>>
> >>>>> Thanks,
> >>>>> Ismael
> >>>>>
> >>>>> On Thu, Mar 23, 2017 at 12:24 PM, Damian Guy <damian....@gmail.com>
> >>>> wrote:
> >>>>>
> >>>>>> Hi All,
> >>>>>>
> >>>>>> I've prepared a KIP to add a configurable delay to the initial
> >>>> consumer
> >>>>>> group rebalance.
> >>>>>>
> >>>>>> Please have look here:
> >>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> >>>>>> 134%3A+Delay+initial+consumer+group+rebalance
> >>>>>>
> >>>>>> Thanks,
> >>>>>> Damian
> >>>>>>
> >>>>>> BTW, i apologize if this appears twice. Seems the first one may have
> >>>> not
> >>>>>> made it.
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>>
> >>>
> >>> --
> >>> -- Guozhang
> >>>
> >>
> >>
> >>
> >> --
> >> -- Guozhang
> >>
> >
>
>

Reply via email to