Re: [VOTE] KIP-570: Add leader epoch in StopReplicaRequest

Jason Gustafson Thu, 26 Mar 2020 11:14:22 -0700

I'm +1 for the change. I think the main advantage is that it simplifies the
batching. More like the LeaderAndIsr/UpdateMetadata requests, the request
grouping becomes less significant. I'm not sure how much of the benefit can
be realized given the need to keep compatibility, but it still seems like a
good improvement to the protocol.


-Jason

On Thu, Mar 26, 2020 at 6:25 AM David Jacot <dja...@confluent.io> wrote:

> I shouldn't have said "always", sorry.
>
> When a replica is deleted, either due to a topic deletion or a
> reassignment,
> the controller transitions the replica to the `OfflineReplica` state and
> then to
> the `ReplicaDeletionStarted` state. The first transition issues a
> StopReplicaRequest
> with DeletePartitions=false and the second transition issues a
> StopReplicaRequest
> with DeleatePartitions=true. The two operations are actually doing the same
> except that the latter one deletes the replica.
>
> At the moment, the `ControllerRequestBatch` partitions the accumulated stop
> replica requests and sends two requests.
>
> https://github.com/apache/kafka/blob/trunk/core/src/main/scala/kafka/controller/ControllerChannelManager.scala#L554
>
> Having a per-partition flag allows us to combine everything into one
> request
> here if the requests are sent within the same batch obviously. This won't
> guarantee that all requests will be combined though because requests are
> batched optimistically in the controller but it opens the door to improve
> it in
> the future.
>
> Best,
> David
>
>
> One side note: The batching of the two requests depends on how the
>
> On Wed, Mar 25, 2020 at 6:12 PM Ismael Juma <ism...@juma.me.uk> wrote:
>
> > Is it really true that the controller always sends two requests? Aren't
> the
> > operations different (stop replica with delete versus stop replica
> > without)?
> >
> > On Wed, Mar 25, 2020, 9:59 AM David Jacot <dja...@confluent.io> wrote:
> >
> > > Hi all,
> > >
> > > I'd like to inform you that I have slightly changed the schema which
> was
> > > proposed
> > > in the KIP. During the implementation, I have realized that the
> proposed
> > > schema
> > > did not work. The new one reorganises how topics/partitions are stored.
> > >
> > > I'd like to amend the current KIP with the following:
> > >
> > > At the moment, the StopReplicaRequest has a top level field named
> > > `DeletePartitions`
> > > which indicates whether the partitions present in the request must be
> > > deleted or not.
> > > The downside of this is that the controller always ends up sending two
> > > StopReplica
> > > requests, one with DeletePartitions=true and one with
> > > DeletePartitions=false.
> > >
> > > Instead, I'd like to add a per-partition DeletePartition field to
> combine
> > > everything in
> > > one request. This will reduce the number of requests sent to each
> broker
> > > and also
> > > increase the batching. I've already implemented it.
> > >
> > > I've already updated the schema in the KIP if you want to see it. I
> will
> > > update the
> > > KIP itself if you agree with the amendment.
> > >
> > > What do you think? Does it sound reasonable?
> > >
> > > Best,
> > > David
> > >
> > > On Fri, Mar 6, 2020 at 3:37 PM David Jacot <dja...@confluent.io>
> wrote:
> > >
> > > > Hi all,
> > > >
> > > > The vote has passed with +3 binding votes (Jason Gustafson, Gwen
> > Shapira,
> > > > Jun Rao).
> > > >
> > > > Thanks to everyone!
> > > >
> > > > Best,
> > > > David
> > > >
> > > > On Wed, Mar 4, 2020 at 9:02 AM David Jacot <dja...@confluent.io>
> > wrote:
> > > >
> > > >> Hi Jun,
> > > >>
> > > >> You're right. I have noticed it while implementing it. I plan to
> use a
> > > >> default
> > > >> value as a sentinel in the protocol (e.g. -2) to cover this case.
> > > >>
> > > >> David
> > > >>
> > > >> On Wed, Mar 4, 2020 at 3:18 AM Jun Rao <j...@confluent.io> wrote:
> > > >>
> > > >>> Hi, David,
> > > >>>
> > > >>> Thanks for the KIP. +1 from me too. Just one comment below.
> > > >>>
> > > >>> 1. Regarding the sentinel leader epoch to indicate topic deletion,
> it
> > > >>> seems
> > > >>> that we need to use a different sentinel value to indicate that the
> > > >>> leader
> > > >>> epoch is not present when the controller is still on the old
> version
> > > >>> during
> > > >>> upgrade.
> > > >>>
> > > >>> Jun
> > > >>>
> > > >>> On Mon, Mar 2, 2020 at 11:20 AM Gwen Shapira <g...@confluent.io>
> > > wrote:
> > > >>>
> > > >>> > +1
> > > >>> >
> > > >>> > On Mon, Feb 24, 2020, 2:16 AM David Jacot <dja...@confluent.io>
> > > wrote:
> > > >>> >
> > > >>> > > Hi all,
> > > >>> > >
> > > >>> > > I would like to start a vote on KIP-570: Add leader epoch in
> > > >>> > > StopReplicaRequest
> > > >>> > >
> > > >>> > > The KIP is here:
> > > >>> > >
> > > >>> > >
> > > >>> >
> > > >>>
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-570%3A+Add+leader+epoch+in+StopReplicaRequest
> > > >>> > >
> > > >>> > > Thanks,
> > > >>> > > David
> > > >>> > >
> > > >>> >
> > > >>>
> > > >>
> > >
> >
>

Re: [VOTE] KIP-570: Add leader epoch in StopReplicaRequest

Reply via email to