I'm +1 for the change. I think the main advantage is that it simplifies the batching. More like the LeaderAndIsr/UpdateMetadata requests, the request grouping becomes less significant. I'm not sure how much of the benefit can be realized given the need to keep compatibility, but it still seems like a good improvement to the protocol.
-Jason On Thu, Mar 26, 2020 at 6:25 AM David Jacot <dja...@confluent.io> wrote: > I shouldn't have said "always", sorry. > > When a replica is deleted, either due to a topic deletion or a > reassignment, > the controller transitions the replica to the `OfflineReplica` state and > then to > the `ReplicaDeletionStarted` state. The first transition issues a > StopReplicaRequest > with DeletePartitions=false and the second transition issues a > StopReplicaRequest > with DeleatePartitions=true. The two operations are actually doing the same > except that the latter one deletes the replica. > > At the moment, the `ControllerRequestBatch` partitions the accumulated stop > replica requests and sends two requests. > > https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L554 > > Having a per-partition flag allows us to combine everything into one > request > here if the requests are sent within the same batch obviously. This won't > guarantee that all requests will be combined though because requests are > batched optimistically in the controller but it opens the door to improve > it in > the future. > > Best, > David > > > One side note: The batching of the two requests depends on how the > > On Wed, Mar 25, 2020 at 6:12 PM Ismael Juma <ism...@juma.me.uk> wrote: > > > Is it really true that the controller always sends two requests? Aren't > the > > operations different (stop replica with delete versus stop replica > > without)? > > > > On Wed, Mar 25, 2020, 9:59 AM David Jacot <dja...@confluent.io> wrote: > > > > > Hi all, > > > > > > I'd like to inform you that I have slightly changed the schema which > was > > > proposed > > > in the KIP. During the implementation, I have realized that the > proposed > > > schema > > > did not work. The new one reorganises how topics/partitions are stored. > > > > > > I'd like to amend the current KIP with the following: > > > > > > At the moment, the StopReplicaRequest has a top level field named > > > `DeletePartitions` > > > which indicates whether the partitions present in the request must be > > > deleted or not. > > > The downside of this is that the controller always ends up sending two > > > StopReplica > > > requests, one with DeletePartitions=true and one with > > > DeletePartitions=false. > > > > > > Instead, I'd like to add a per-partition DeletePartition field to > combine > > > everything in > > > one request. This will reduce the number of requests sent to each > broker > > > and also > > > increase the batching. I've already implemented it. > > > > > > I've already updated the schema in the KIP if you want to see it. I > will > > > update the > > > KIP itself if you agree with the amendment. > > > > > > What do you think? Does it sound reasonable? > > > > > > Best, > > > David > > > > > > On Fri, Mar 6, 2020 at 3:37 PM David Jacot <dja...@confluent.io> > wrote: > > > > > > > Hi all, > > > > > > > > The vote has passed with +3 binding votes (Jason Gustafson, Gwen > > Shapira, > > > > Jun Rao). > > > > > > > > Thanks to everyone! > > > > > > > > Best, > > > > David > > > > > > > > On Wed, Mar 4, 2020 at 9:02 AM David Jacot <dja...@confluent.io> > > wrote: > > > > > > > >> Hi Jun, > > > >> > > > >> You're right. I have noticed it while implementing it. I plan to > use a > > > >> default > > > >> value as a sentinel in the protocol (e.g. -2) to cover this case. > > > >> > > > >> David > > > >> > > > >> On Wed, Mar 4, 2020 at 3:18 AM Jun Rao <j...@confluent.io> wrote: > > > >> > > > >>> Hi, David, > > > >>> > > > >>> Thanks for the KIP. +1 from me too. Just one comment below. > > > >>> > > > >>> 1. Regarding the sentinel leader epoch to indicate topic deletion, > it > > > >>> seems > > > >>> that we need to use a different sentinel value to indicate that the > > > >>> leader > > > >>> epoch is not present when the controller is still on the old > version > > > >>> during > > > >>> upgrade. > > > >>> > > > >>> Jun > > > >>> > > > >>> On Mon, Mar 2, 2020 at 11:20 AM Gwen Shapira <g...@confluent.io> > > > wrote: > > > >>> > > > >>> > +1 > > > >>> > > > > >>> > On Mon, Feb 24, 2020, 2:16 AM David Jacot <dja...@confluent.io> > > > wrote: > > > >>> > > > > >>> > > Hi all, > > > >>> > > > > > >>> > > I would like to start a vote on KIP-570: Add leader epoch in > > > >>> > > StopReplicaRequest > > > >>> > > > > > >>> > > The KIP is here: > > > >>> > > > > > >>> > > > > > >>> > > > > >>> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest > > > >>> > > > > > >>> > > Thanks, > > > >>> > > David > > > >>> > > > > > >>> > > > > >>> > > > >> > > > > > >