Re: [VOTE]KIP-966: Eligible Leader Replicas

2023-10-11 Thread Calvin Liu
Hi,
The KIP has received 3 binding votes, from Justine Olshan, Jun Rao and Colin
McCabe.
Thanks also to Jeff Kim, Jack Vanlightly, Artem Livshits, David Arthur and
David Jacot for the comments that made the KIP better.
Thanks for the help!



Re: [VOTE]KIP-966: Eligible Leader Replicas

2023-10-09 Thread Calvin Liu
Hi Colin,
Thanks for the feedback. I have updated the KIP, but with the following
changes.
-- The request is still grouped by topics. For each topic, the caller can
specify either partition IDs or a range of partitions to query. For a range
query, the request should specify the "first partition id"; partitions with
IDs greater than or equal to that id will then be returned.
-- The response is still grouped by topics. When the quota limit is reached:
 For a range request, the "next partition id" will be specified within the
topic when only some of its partitions can be returned. When the topic
can't be returned at all, the topic will have the error
REQUEST_LIMIT_REACHED.
 For a partition-specific topic, every partition that can't be returned
will have the error REQUEST_LIMIT_REACHED.
Note that a request can mix partition-specific and range-request topics.
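
The behaviour described above can be sketched roughly as follows. This is
only an illustration, not the KIP's actual schema or broker code: the names
TopicRequest, TopicResponse, describe, cluster and limit are hypothetical,
and the quota is assumed to be a simple count of partitions per response.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

REQUEST_LIMIT_REACHED = "REQUEST_LIMIT_REACHED"


@dataclass
class TopicRequest:
    # Hypothetical request entry: set exactly one of the two optional fields.
    name: str
    partition_ids: Optional[List[int]] = None   # partition-specific query
    first_partition_id: Optional[int] = None    # range query


@dataclass
class TopicResponse:
    # Hypothetical response entry, grouped by topic.
    name: str
    partitions: List[int] = field(default_factory=list)
    partition_errors: Dict[int, str] = field(default_factory=dict)
    next_partition_id: Optional[int] = None
    error: Optional[str] = None


def describe(requests: List[TopicRequest],
             cluster: Dict[str, List[int]],
             limit: int) -> List[TopicResponse]:
    """Answer the requests in order while at most `limit` partitions fit."""
    budget = limit  # assumed quota: number of partitions per response
    responses = []
    for req in requests:
        resp = TopicResponse(req.name)
        if req.partition_ids is not None:
            # Partition-specific topic: each partition that does not fit
            # gets its own REQUEST_LIMIT_REACHED error.
            for pid in req.partition_ids:
                if budget > 0:
                    resp.partitions.append(pid)
                    budget -= 1
                else:
                    resp.partition_errors[pid] = REQUEST_LIMIT_REACHED
        else:
            # Range topic: every partition with id >= first_partition_id.
            wanted = sorted(p for p in cluster.get(req.name, [])
                            if p >= req.first_partition_id)
            returned = wanted[:budget]
            resp.partitions = returned
            budget -= len(returned)
            if wanted and not returned:
                # Nothing of this topic fits: topic-level error.
                resp.error = REQUEST_LIMIT_REACHED
            elif len(returned) < len(wanted):
                # Partial result: tell the caller where to resume.
                resp.next_partition_id = wanted[len(returned)]
        responses.append(resp)
    return responses
```

A caller would resend a range topic with first_partition_id set to the
returned next_partition_id, and resend any topics or partitions that came
back with REQUEST_LIMIT_REACHED.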



Re: [VOTE]KIP-966: Eligible Leader Replicas

2023-10-06 Thread Colin McCabe
Hi Calvin,

Thanks for the KIP. I think the config discussion was good and I have no more 
comments there.

I have one last thing I think we should fix up:

I think we should improve DescribeTopicRequest. The current mechanism of "you 
can only list 20 topics" doesn't do a very good job of limiting the results. 
After all, if those topics only have 1 partition each, this means a pretty 
small RPC. If they have 10,000 partitions each, then it's a very large RPC.

I think a better mechanism would be:
1. Have the request be a list of (topic_name, partition_id) pairs plus a 
(first_topic_name, first_partition_id) pair.
(for the initial request, first_topic_name="" and first_partition_id=-1, of 
course)
(if partition_id = -1 then we should list all partitions for the topic)

2. When returning results, sort everything alphabetically and return the first 
1000, plus a (next_topic, next_partition_id) pair. (if there is nothing more to 
return, next_topic = null.)
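
A minimal sketch of the paging scheme in points 1 and 2, for illustration
only. The function describe_page, the cluster mapping and the page_size
parameter are hypothetical names, "alphabetically" is taken to mean ordering
by topic name and then by partition id, and returning -1 as the partition id
next to a null next_topic is an assumption.

```python
from typing import Dict, List, Optional, Tuple

Pair = Tuple[str, int]                    # (topic_name, partition_id)
Cursor = Tuple[Optional[str], int]        # (next_topic, next_partition_id)


def describe_page(requested: List[Pair],
                  cluster: Dict[str, List[int]],
                  first: Pair = ("", -1),
                  page_size: int = 1000) -> Tuple[List[Pair], Cursor]:
    """Return one page of (topic, partition) pairs plus the next cursor."""
    # Expand partition_id == -1 into every partition of that topic.
    expanded = set()
    for topic, pid in requested:
        if pid == -1:
            expanded.update((topic, p) for p in cluster.get(topic, []))
        else:
            expanded.add((topic, pid))

    # Sort by topic name, then partition id, and skip everything that
    # comes before the (first_topic_name, first_partition_id) cursor.
    ordered = sorted(pair for pair in expanded if pair >= first)

    page = ordered[:page_size]
    if len(ordered) > page_size:
        next_cursor: Cursor = ordered[page_size]
    else:
        next_cursor = (None, -1)          # nothing more to return
    return page, next_cursor


def describe_all(requested: List[Pair],
                 cluster: Dict[str, List[int]],
                 page_size: int = 1000) -> List[Pair]:
    """Client-side loop: keep asking until next_topic comes back as None."""
    results: List[Pair] = []
    cursor: Cursor = ("", -1)             # initial request
    while cursor[0] is not None:
        page, cursor = describe_page(requested, cluster, cursor, page_size)
        results.extend(page)
    return results
```

With this shape the size of each response is bounded by page_size no matter
how many partitions each topic has, which is the point of the proposal.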

With those changes I would be +1

best,
Colin


If the response wasn't long enough, the caller can set 


Re: [VOTE]KIP-966: Eligible Leader Replicas

2023-10-04 Thread Jun Rao
Hi, Calvin,

Thanks for the KIP. +1 from me too.

Jun



Re: [VOTE]KIP-966: Eligible Leader Replicas

2023-09-20 Thread Justine Olshan
Thanks Calvin.
I think this will be very helpful going forward to minimize data loss.

+1 from me (binding)

Justine

On Wed, Sep 20, 2023 at 3:42 PM Calvin Liu wrote:

> Hi all,
> I'd like to call for a vote on KIP-966 which includes a series of
> enhancements to the current ISR model.
>
> - Introduce the new HWM advancement requirement which enables the system
>   to have more potentially data-safe replicas.
> - Introduce Eligible Leader Replicas (ELR) to represent the above
>   data-safe replicas.
> - Introduce Unclean Recovery process which will deterministically choose
>   the best replica during an unclean leader election.
>
>
> KIP:
>
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-966%3A+Eligible+Leader+Replicas
>
> Discussion thread:
> https://lists.apache.org/thread/gpbpx9kpd7c62dm962h6kww0ghgznb38
>