Re: [VOTE] KIP-570: Add leader epoch in StopReplicaRequest
I'm +1 for the change. I think the main advantage is that it simplifies the batching. More like the LeaderAndIsr/UpdateMetadata requests, the request grouping becomes less significant. I'm not sure how much of the benefit can be realized given the need to keep compatibility, but it still seems like a good improvement to the protocol. -Jason On Thu, Mar 26, 2020 at 6:25 AM David Jacot wrote: > I shouldn't have said "always", sorry. > > When a replica is deleted, either due to a topic deletion or a > reassignment, > the controller transitions the replica to the `OfflineReplica` state and > then to > the `ReplicaDeletionStarted` state. The first transition issues a > StopReplicaRequest > with DeletePartitions=false and the second transition issues a > StopReplicaRequest > with DeleatePartitions=true. The two operations are actually doing the same > except that the latter one deletes the replica. > > At the moment, the `ControllerRequestBatch` partitions the accumulated stop > replica requests and sends two requests. > > https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L554 > > Having a per-partition flag allows us to combine everything into one > request > here if the requests are sent within the same batch obviously. This won't > guarantee that all requests will be combined though because requests are > batched optimistically in the controller but it opens the door to improve > it in > the future. > > Best, > David > > > One side note: The batching of the two requests depends on how the > > On Wed, Mar 25, 2020 at 6:12 PM Ismael Juma wrote: > > > Is it really true that the controller always sends two requests? Aren't > the > > operations different (stop replica with delete versus stop replica > > without)? > > > > On Wed, Mar 25, 2020, 9:59 AM David Jacot wrote: > > > > > Hi all, > > > > > > I'd like to inform you that I have slightly changed the schema which > was > > > proposed > > > in the KIP. During the implementation, I have realized that the > proposed > > > schema > > > did not work. The new one reorganises how topics/partitions are stored. > > > > > > I'd like to amend the current KIP with the following: > > > > > > At the moment, the StopReplicaRequest has a top level field named > > > `DeletePartitions` > > > which indicates whether the partitions present in the request must be > > > deleted or not. > > > The downside of this is that the controller always ends up sending two > > > StopReplica > > > requests, one with DeletePartitions=true and one with > > > DeletePartitions=false. > > > > > > Instead, I'd like to add a per-partition DeletePartition field to > combine > > > everything in > > > one request. This will reduce the number of requests sent to each > broker > > > and also > > > increase the batching. I've already implemented it. > > > > > > I've already updated the schema in the KIP if you want to see it. I > will > > > update the > > > KIP itself if you agree with the amendment. > > > > > > What do you think? Does it sound reasonable? > > > > > > Best, > > > David > > > > > > On Fri, Mar 6, 2020 at 3:37 PM David Jacot > wrote: > > > > > > > Hi all, > > > > > > > > The vote has passed with +3 binding votes (Jason Gustafson, Gwen > > Shapira, > > > > Jun Rao). > > > > > > > > Thanks to everyone! > > > > > > > > Best, > > > > David > > > > > > > > On Wed, Mar 4, 2020 at 9:02 AM David Jacot > > wrote: > > > > > > > >> Hi Jun, > > > >> > > > >> You're right. I have noticed it while implementing it. I plan to > use a > > > >> default > > > >> value as a sentinel in the protocol (e.g. -2) to cover this case. > > > >> > > > >> David > > > >> > > > >> On Wed, Mar 4, 2020 at 3:18 AM Jun Rao wrote: > > > >> > > > >>> Hi, David, > > > >>> > > > >>> Thanks for the KIP. +1 from me too. Just one comment below. > > > >>> > > > >>> 1. Regarding the sentinel leader epoch to indicate topic deletion, > it > > > >>> seems > > > >>> that we need to use a different sentinel value to indicate that the > > > >>> leader > > > >>> epoch is not present when the controller is still on the old > version > > > >>> during > > > >>> upgrade. > > > >>> > > > >>> Jun > > > >>> > > > >>> On Mon, Mar 2, 2020 at 11:20 AM Gwen Shapira > > > wrote: > > > >>> > > > >>> > +1 > > > >>> > > > > >>> > On Mon, Feb 24, 2020, 2:16 AM David Jacot > > > wrote: > > > >>> > > > > >>> > > Hi all, > > > >>> > > > > > >>> > > I would like to start a vote on KIP-570: Add leader epoch in > > > >>> > > StopReplicaRequest > > > >>> > > > > > >>> > > The KIP is here: > > > >>> > > > > > >>> > > > > > >>> > > > > >>> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest > > > >>> > > > > > >>> > > Thanks, > > > >>> > > David > > > >>> > > > > > >>> > > > > >>> > > > >> > > > > > >
Re: [VOTE] KIP-570: Add leader epoch in StopReplicaRequest
I shouldn't have said "always", sorry. When a replica is deleted, either due to a topic deletion or a reassignment, the controller transitions the replica to the `OfflineReplica` state and then to the `ReplicaDeletionStarted` state. The first transition issues a StopReplicaRequest with DeletePartitions=false and the second transition issues a StopReplicaRequest with DeleatePartitions=true. The two operations are actually doing the same except that the latter one deletes the replica. At the moment, the `ControllerRequestBatch` partitions the accumulated stop replica requests and sends two requests. https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L554 Having a per-partition flag allows us to combine everything into one request here if the requests are sent within the same batch obviously. This won't guarantee that all requests will be combined though because requests are batched optimistically in the controller but it opens the door to improve it in the future. Best, David One side note: The batching of the two requests depends on how the On Wed, Mar 25, 2020 at 6:12 PM Ismael Juma wrote: > Is it really true that the controller always sends two requests? Aren't the > operations different (stop replica with delete versus stop replica > without)? > > On Wed, Mar 25, 2020, 9:59 AM David Jacot wrote: > > > Hi all, > > > > I'd like to inform you that I have slightly changed the schema which was > > proposed > > in the KIP. During the implementation, I have realized that the proposed > > schema > > did not work. The new one reorganises how topics/partitions are stored. > > > > I'd like to amend the current KIP with the following: > > > > At the moment, the StopReplicaRequest has a top level field named > > `DeletePartitions` > > which indicates whether the partitions present in the request must be > > deleted or not. > > The downside of this is that the controller always ends up sending two > > StopReplica > > requests, one with DeletePartitions=true and one with > > DeletePartitions=false. > > > > Instead, I'd like to add a per-partition DeletePartition field to combine > > everything in > > one request. This will reduce the number of requests sent to each broker > > and also > > increase the batching. I've already implemented it. > > > > I've already updated the schema in the KIP if you want to see it. I will > > update the > > KIP itself if you agree with the amendment. > > > > What do you think? Does it sound reasonable? > > > > Best, > > David > > > > On Fri, Mar 6, 2020 at 3:37 PM David Jacot wrote: > > > > > Hi all, > > > > > > The vote has passed with +3 binding votes (Jason Gustafson, Gwen > Shapira, > > > Jun Rao). > > > > > > Thanks to everyone! > > > > > > Best, > > > David > > > > > > On Wed, Mar 4, 2020 at 9:02 AM David Jacot > wrote: > > > > > >> Hi Jun, > > >> > > >> You're right. I have noticed it while implementing it. I plan to use a > > >> default > > >> value as a sentinel in the protocol (e.g. -2) to cover this case. > > >> > > >> David > > >> > > >> On Wed, Mar 4, 2020 at 3:18 AM Jun Rao wrote: > > >> > > >>> Hi, David, > > >>> > > >>> Thanks for the KIP. +1 from me too. Just one comment below. > > >>> > > >>> 1. Regarding the sentinel leader epoch to indicate topic deletion, it > > >>> seems > > >>> that we need to use a different sentinel value to indicate that the > > >>> leader > > >>> epoch is not present when the controller is still on the old version > > >>> during > > >>> upgrade. > > >>> > > >>> Jun > > >>> > > >>> On Mon, Mar 2, 2020 at 11:20 AM Gwen Shapira > > wrote: > > >>> > > >>> > +1 > > >>> > > > >>> > On Mon, Feb 24, 2020, 2:16 AM David Jacot > > wrote: > > >>> > > > >>> > > Hi all, > > >>> > > > > >>> > > I would like to start a vote on KIP-570: Add leader epoch in > > >>> > > StopReplicaRequest > > >>> > > > > >>> > > The KIP is here: > > >>> > > > > >>> > > > > >>> > > > >>> > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest > > >>> > > > > >>> > > Thanks, > > >>> > > David > > >>> > > > > >>> > > > >>> > > >> > > >
Re: [VOTE] KIP-570: Add leader epoch in StopReplicaRequest
Is it really true that the controller always sends two requests? Aren't the operations different (stop replica with delete versus stop replica without)? On Wed, Mar 25, 2020, 9:59 AM David Jacot wrote: > Hi all, > > I'd like to inform you that I have slightly changed the schema which was > proposed > in the KIP. During the implementation, I have realized that the proposed > schema > did not work. The new one reorganises how topics/partitions are stored. > > I'd like to amend the current KIP with the following: > > At the moment, the StopReplicaRequest has a top level field named > `DeletePartitions` > which indicates whether the partitions present in the request must be > deleted or not. > The downside of this is that the controller always ends up sending two > StopReplica > requests, one with DeletePartitions=true and one with > DeletePartitions=false. > > Instead, I'd like to add a per-partition DeletePartition field to combine > everything in > one request. This will reduce the number of requests sent to each broker > and also > increase the batching. I've already implemented it. > > I've already updated the schema in the KIP if you want to see it. I will > update the > KIP itself if you agree with the amendment. > > What do you think? Does it sound reasonable? > > Best, > David > > On Fri, Mar 6, 2020 at 3:37 PM David Jacot wrote: > > > Hi all, > > > > The vote has passed with +3 binding votes (Jason Gustafson, Gwen Shapira, > > Jun Rao). > > > > Thanks to everyone! > > > > Best, > > David > > > > On Wed, Mar 4, 2020 at 9:02 AM David Jacot wrote: > > > >> Hi Jun, > >> > >> You're right. I have noticed it while implementing it. I plan to use a > >> default > >> value as a sentinel in the protocol (e.g. -2) to cover this case. > >> > >> David > >> > >> On Wed, Mar 4, 2020 at 3:18 AM Jun Rao wrote: > >> > >>> Hi, David, > >>> > >>> Thanks for the KIP. +1 from me too. Just one comment below. > >>> > >>> 1. Regarding the sentinel leader epoch to indicate topic deletion, it > >>> seems > >>> that we need to use a different sentinel value to indicate that the > >>> leader > >>> epoch is not present when the controller is still on the old version > >>> during > >>> upgrade. > >>> > >>> Jun > >>> > >>> On Mon, Mar 2, 2020 at 11:20 AM Gwen Shapira > wrote: > >>> > >>> > +1 > >>> > > >>> > On Mon, Feb 24, 2020, 2:16 AM David Jacot > wrote: > >>> > > >>> > > Hi all, > >>> > > > >>> > > I would like to start a vote on KIP-570: Add leader epoch in > >>> > > StopReplicaRequest > >>> > > > >>> > > The KIP is here: > >>> > > > >>> > > > >>> > > >>> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest > >>> > > > >>> > > Thanks, > >>> > > David > >>> > > > >>> > > >>> > >> >
Re: [VOTE] KIP-570: Add leader epoch in StopReplicaRequest
Hi all, I'd like to inform you that I have slightly changed the schema which was proposed in the KIP. During the implementation, I have realized that the proposed schema did not work. The new one reorganises how topics/partitions are stored. I'd like to amend the current KIP with the following: At the moment, the StopReplicaRequest has a top level field named `DeletePartitions` which indicates whether the partitions present in the request must be deleted or not. The downside of this is that the controller always ends up sending two StopReplica requests, one with DeletePartitions=true and one with DeletePartitions=false. Instead, I'd like to add a per-partition DeletePartition field to combine everything in one request. This will reduce the number of requests sent to each broker and also increase the batching. I've already implemented it. I've already updated the schema in the KIP if you want to see it. I will update the KIP itself if you agree with the amendment. What do you think? Does it sound reasonable? Best, David On Fri, Mar 6, 2020 at 3:37 PM David Jacot wrote: > Hi all, > > The vote has passed with +3 binding votes (Jason Gustafson, Gwen Shapira, > Jun Rao). > > Thanks to everyone! > > Best, > David > > On Wed, Mar 4, 2020 at 9:02 AM David Jacot wrote: > >> Hi Jun, >> >> You're right. I have noticed it while implementing it. I plan to use a >> default >> value as a sentinel in the protocol (e.g. -2) to cover this case. >> >> David >> >> On Wed, Mar 4, 2020 at 3:18 AM Jun Rao wrote: >> >>> Hi, David, >>> >>> Thanks for the KIP. +1 from me too. Just one comment below. >>> >>> 1. Regarding the sentinel leader epoch to indicate topic deletion, it >>> seems >>> that we need to use a different sentinel value to indicate that the >>> leader >>> epoch is not present when the controller is still on the old version >>> during >>> upgrade. >>> >>> Jun >>> >>> On Mon, Mar 2, 2020 at 11:20 AM Gwen Shapira wrote: >>> >>> > +1 >>> > >>> > On Mon, Feb 24, 2020, 2:16 AM David Jacot wrote: >>> > >>> > > Hi all, >>> > > >>> > > I would like to start a vote on KIP-570: Add leader epoch in >>> > > StopReplicaRequest >>> > > >>> > > The KIP is here: >>> > > >>> > > >>> > >>> https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest >>> > > >>> > > Thanks, >>> > > David >>> > > >>> > >>> >>
Re: [VOTE] KIP-570: Add leader epoch in StopReplicaRequest
Hi all, The vote has passed with +3 binding votes (Jason Gustafson, Gwen Shapira, Jun Rao). Thanks to everyone! Best, David On Wed, Mar 4, 2020 at 9:02 AM David Jacot wrote: > Hi Jun, > > You're right. I have noticed it while implementing it. I plan to use a > default > value as a sentinel in the protocol (e.g. -2) to cover this case. > > David > > On Wed, Mar 4, 2020 at 3:18 AM Jun Rao wrote: > >> Hi, David, >> >> Thanks for the KIP. +1 from me too. Just one comment below. >> >> 1. Regarding the sentinel leader epoch to indicate topic deletion, it >> seems >> that we need to use a different sentinel value to indicate that the leader >> epoch is not present when the controller is still on the old version >> during >> upgrade. >> >> Jun >> >> On Mon, Mar 2, 2020 at 11:20 AM Gwen Shapira wrote: >> >> > +1 >> > >> > On Mon, Feb 24, 2020, 2:16 AM David Jacot wrote: >> > >> > > Hi all, >> > > >> > > I would like to start a vote on KIP-570: Add leader epoch in >> > > StopReplicaRequest >> > > >> > > The KIP is here: >> > > >> > > >> > >> https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest >> > > >> > > Thanks, >> > > David >> > > >> > >> >
Re: [VOTE] KIP-570: Add leader epoch in StopReplicaRequest
Hi Jun, You're right. I have noticed it while implementing it. I plan to use a default value as a sentinel in the protocol (e.g. -2) to cover this case. David On Wed, Mar 4, 2020 at 3:18 AM Jun Rao wrote: > Hi, David, > > Thanks for the KIP. +1 from me too. Just one comment below. > > 1. Regarding the sentinel leader epoch to indicate topic deletion, it seems > that we need to use a different sentinel value to indicate that the leader > epoch is not present when the controller is still on the old version during > upgrade. > > Jun > > On Mon, Mar 2, 2020 at 11:20 AM Gwen Shapira wrote: > > > +1 > > > > On Mon, Feb 24, 2020, 2:16 AM David Jacot wrote: > > > > > Hi all, > > > > > > I would like to start a vote on KIP-570: Add leader epoch in > > > StopReplicaRequest > > > > > > The KIP is here: > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest > > > > > > Thanks, > > > David > > > > > >
Re: [VOTE] KIP-570: Add leader epoch in StopReplicaRequest
Hi, David, Thanks for the KIP. +1 from me too. Just one comment below. 1. Regarding the sentinel leader epoch to indicate topic deletion, it seems that we need to use a different sentinel value to indicate that the leader epoch is not present when the controller is still on the old version during upgrade. Jun On Mon, Mar 2, 2020 at 11:20 AM Gwen Shapira wrote: > +1 > > On Mon, Feb 24, 2020, 2:16 AM David Jacot wrote: > > > Hi all, > > > > I would like to start a vote on KIP-570: Add leader epoch in > > StopReplicaRequest > > > > The KIP is here: > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest > > > > Thanks, > > David > > >
Re: [VOTE] KIP-570: Add leader epoch in StopReplicaRequest
+1 On Mon, Feb 24, 2020, 2:16 AM David Jacot wrote: > Hi all, > > I would like to start a vote on KIP-570: Add leader epoch in > StopReplicaRequest > > The KIP is here: > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest > > Thanks, > David >
Re: [VOTE] KIP-570: Add leader epoch in StopReplicaRequest
+1 On Mon, Feb 24, 2020 at 2:16 AM David Jacot wrote: > Hi all, > > I would like to start a vote on KIP-570: Add leader epoch in > StopReplicaRequest > > The KIP is here: > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest > > Thanks, > David >
[VOTE] KIP-570: Add leader epoch in StopReplicaRequest
Hi all, I would like to start a vote on KIP-570: Add leader epoch in StopReplicaRequest The KIP is here: https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest Thanks, David