Thanks for the KIP Damian!

My two cents:

 - we should have an explicit parameter for this -- implicit settings are
always tricky (the "importance" of this parameter would be LOW)

 - the config should be different for each consumer group:
   * assume you have a stateless app, you want to rebalance immediately
   * if you start up in a virtualized environment using tools like
Mesos, you might need a different value than on bare metal (no VM to be
started)
   * it also depends on how many consumer instances you expect -- it's
harder to start up 100 instances in 3 seconds than 5

 - the default value should be zero


One more thought: what about scaling scenarios? If a consumer group has
10 instances and should be scaled up to 20, it would make sense to do
this with a single rebalance, too. Thus, I am wondering whether it would
make sense to apply this delay each time a new consumer joins the group,
even if the group is not empty?
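
To make the scaling scenario concrete, here is a minimal sketch of how a
delay could coalesce staggered joins into fewer rebalances. It is a toy
model, not the coordinator's actual logic: the function name
count_rebalances, the fixed (non-extending) delay window, and the join
timings are all assumptions made for illustration.

```python
def count_rebalances(join_times_ms, delay_ms):
    """Count rebalances in a simplified model (an assumption, not Kafka's code).

    The first join that arrives outside an open window triggers a rebalance,
    and the coordinator is assumed to hold that rebalance open for a fixed
    `delay_ms`. Any join arriving within the window is folded into it.
    """
    rebalances = 0
    window_end = None  # time at which the currently open delay window closes
    for t in sorted(join_times_ms):
        if window_end is None or t > window_end:
            # No open window covers this join: a new rebalance starts here.
            rebalances += 1
            window_end = t + delay_ms
    return rebalances


# Hypothetical scale-up: 10 new instances, one joining every 300 ms.
joins = [i * 300 for i in range(10)]

for delay in (0, 1000, 3000):
    print(delay, count_rebalances(joins, delay))
```

In this model a delay of zero gives one rebalance per joining instance,
while a delay that covers the whole start-up period gives a single
rebalance, which is why the right value depends on the group.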


-Matthias


On 3/23/17 10:19 AM, Damian Guy wrote:
> Thanks Guozhang - i think another problem with this is that it is overloading
> session.timeout.ms to mean multiple things. I'm not sure that is a good
> thing.
> 
> On Thu, 23 Mar 2017 at 17:14 Guozhang Wang <wangg...@gmail.com> wrote:
> 
>> The downside of it, though, is that although it "hides" this from most of
>> the users needing to be aware of it, by default session timeout i.e. the
>> rebalance timeout is 10 seconds which could arguably be too long.
>>
>>
>> Guozhang
>>
>> On Thu, Mar 23, 2017 at 10:12 AM, Guozhang Wang <wangg...@gmail.com>
>> wrote:
>>
>>> Just throwing another alternative idea here: we can consider using the
>>> rebalance timeout value which is already included in the join request
>>> protocol (and on the current Java client it is always written as the
>>> session timeout value), that the first member joining will always force
>> the
>>> coordinator to wait that long. By doing this we do not need to bump up
>> the
>>> protocol either.
>>>
>>>
>>> Guozhang
>>>
>>> On Thu, Mar 23, 2017 at 5:49 AM, Damian Guy <damian....@gmail.com>
>> wrote:
>>>
>>>> Hi Ismael,
>>>>
>>>> Mostly to avoid the protocol bump.
>>>>
>>>> I agree that it may be difficult to choose the right delay for all
>>>> consumer
>>>> groups, but we wanted to make this something that most users don't
>> really
>>>> need to think about, i.e., a small enough default delay that works in
>> the
>>>> majority of cases. However it would be much more flexible as a consumer
>>>> config, which i'm happy to pursue if this change is worthy of a protocol
>>>> bump.
>>>>
>>>> Thanks,
>>>> Damian
>>>>
>>>> On Thu, 23 Mar 2017 at 12:35 Ismael Juma <ism...@juma.me.uk> wrote:
>>>>
>>>>> Thanks for the KIP, Damian. It makes sense to avoid multiple
>> rebalances
>>>>> during start-up. One issue with having this as a broker config is that
>>>> it
>>>>> may be difficult to choose the right delay for all consumer groups.
>> Can
>>>> you
>>>>> elaborate a little more on why the first alternative (add a consumer
>>>>> config) was rejected? We bump protocol versions regularly (when it
>> makes
>>>>> sense), so it would be good to get a bit more detail.
>>>>>
>>>>> Thanks,
>>>>> Ismael
>>>>>
>>>>> On Thu, Mar 23, 2017 at 12:24 PM, Damian Guy <damian....@gmail.com>
>>>> wrote:
>>>>>
>>>>>> Hi All,
>>>>>>
>>>>>> I've prepared a KIP to add a configurable delay to the initial
>>>> consumer
>>>>>> group rebalance.
>>>>>>
>>>>>> Please have a look here:
>>>>>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-134%3A+Delay+initial+consumer+group+rebalance
>>>>>>
>>>>>> Thanks,
>>>>>> Damian
>>>>>>
>>>>>> BTW, i apologize if this appears twice. Seems the first one may have
>>>> not
>>>>>> made it.
>>>>>>
>>>>>
>>>>
>>>
>>>
>>>
>>> --
>>> -- Guozhang
>>>
>>
>>
>>
>> --
>> -- Guozhang
>>
> 
