>
> I am ok with adding a note in the KIP but the note should say that it has
> an elevated risk for this scenario due to increased probability of having
> an aggressive local cleanup with Tiered Storage.
>

I would like to clarify that there is no elevated risk because of this KIP.
This risk already exists with the current protocol, where an empty follower
for a tiered-storage-enabled topic catches up with the leader by replicating
data beginning from the leader's local log start offset.

With this KIP, the follower instead starts from the last tiered offset, which
is usually far ahead of the leader's local log start offset and can never be
behind it. In this rare scenario, that offset is therefore more likely to
still be available in the leader's local log than the local log start offset
is.

Therefore, the risk is actually lower when the follower starts from the last
tiered offset instead of the local log start offset.
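
To make the comparison concrete, here is a minimal sketch (in Java) of how an
empty follower would pick its initial fetch offset under the two protocols.
The names used here (FetchOffsetChooser, LeaderEndpoint, lastTieredOffset,
localLogStartOffset) are assumptions for illustration only and do not
correspond to the actual replica fetcher code:

// Illustrative sketch only -- not the real Kafka replica fetcher classes.
class FetchOffsetChooser {

    interface LeaderEndpoint {
        long localLogStartOffset(); // first offset still in the leader's local log
        long lastTieredOffset();    // highest offset already copied to remote storage
    }

    long initialFetchOffset(LeaderEndpoint leader, boolean fetchFromTieredOffset) {
        if (fetchFromTieredOffset) {
            // KIP-1023 behaviour: start right after the last tiered offset.
            // Everything below the local log start offset must already be tiered,
            // so lastTieredOffset + 1 >= localLogStartOffset, i.e. this offset can
            // never be behind the leader's local log start offset and is less
            // likely to be cleaned up locally before the follower fetches it.
            return leader.lastTieredOffset() + 1;
        } else {
            // Current behaviour: start at the leader's local log start offset,
            // the oldest locally retained offset and the first one eligible for
            // deletion, hence the higher chance of repeated OffsetOutOfRange
            // resets under aggressive local cleanup.
            return leader.localLogStartOffset();
        }
    }
}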

Abhijeet.



On Wed, Jul 24, 2024 at 7:26 PM Divij Vaidya <divijvaidy...@gmail.com>
wrote:

> The difference between the two scenarios you mentioned is that with Tiered
> Storage, the chances of hitting this scenario increases since a user is
> likely to have an aggressive setting for local disk data cleanup, which
> would not be the case in empty followers catching up in a non-tiered
> storage world.
>
> I am ok with adding a note in the KIP but the note should say that it has
> an elevated risk for this scenario due to increased probability of having
> an aggressive local cleanup with Tiered Storage.
>
> --
> Divij Vaidya
>
>
>
> On Wed, Jul 24, 2024 at 1:22 PM Abhijeet Kumar <abhijeet.cse....@gmail.com
> >
> wrote:
>
> > Hi Divij,
> >
> > The rare scenario we are discussing is similar to an empty follower
> trying
> > to catch up with the leader for a topic that is not enabled with tiered
> > storage. Consider the following steps:
> >
> > 1. Follower requests offset 0 from the leader.
> > 2. Offset 0 is no longer valid on the leader as its log start offset is 10,
> > hence the leader throws an Out of Range error
> > 3. Follower fetches the earliest offset from the leader and gets 10, then
> > resets its Fetch offset to 10.
> > 4. Follower requests offset 10 from the leader, but the previous log
> start
> > offset (10) is deleted from the leader and the new log start offset is
> 15.
> > Hence the leader throws an Out of Range error. The follower goes back to
> > step 3
> >
> > Even in this scenario, theoretically, the follower will never be able to
> > catch up. Since this is an existing problem, that affects even regular
> > replication for topics without tiered storage, should we take this up
> > separately?
> > I can add a small note in the KIP saying that this behavior for follower
> > catchup is similar to a scenario when tiered storage is not enabled.
> >
> > Regards,
> > Abhijeet.
> >
> >
> >
> > On Tue, Jul 23, 2024 at 4:49 PM Divij Vaidya <divijvaidy...@gmail.com>
> > wrote:
> >
> > > Thank you for your response Abhijeet. You have understood the scenario
> > > correctly. For the purpose of discussion, please consider the latter
> case
> > > where offset 11 is not available on the leader anymore (it got cleaned
> > > locally since the last tiered offset is 15). In such a case, you
> > > mentioned, the follower will eventually be able to catch up with the
> > leader
> > > by resetting its fetch offset until the offset is available on the
> > leader's
> > > local log. Correct me if I am wrong but it is not guaranteed that it
> will
> > > eventually catch up because theoretically, every time it asks for a
> newer
> > > fetch offset, the leader may have deleted it locally. I understand that
> > it
> > > is an edge case scenario which will only happen with configurations for
> > > small segment sizes and aggressive cleaning but nevertheless, it is a
> > > possible scenario.
> > >
> > > Do you agree that theoretically it is possible for the follower to loop
> > > such that it is never able to catch up?
> > >
> > > We can proceed with the KIP with an understanding that this scenario is
> > > rare and we are willing to accept the risk of it. In such a case, we
> > should
> > > add a detection mechanism for such a scenario in the KIP, so that if we
> > > encounter this scenario, the user has a way to detect (and mitigate
> it).
> > > Alternatively, we can change the KIP design to ensure that we never
> > > encounter this scenario. Given the rarity of the scenario, I am ok with
> > > having a detection mechanism (metric?) in place and having this
> scenario
> > > documented as an acceptable risk in current design.
> > >
> > > --
> > > Divij Vaidya
> > >
> > >
> > >
> > > On Tue, Jul 23, 2024 at 11:55 AM Abhijeet Kumar <
> > > abhijeet.cse....@gmail.com>
> > > wrote:
> > >
> > > > Hi Divij,
> > > >
> > > > Seems like there is some confusion about the new protocol for
> fetching
> > > from
> > > > tiered offset.
> > > > The scenario you are highlighting is where,
> > > > Leader's Log Start Offset = 0
> > > > Last Tiered Offset = 10
> > > >
> > > > Following is the sequence of events that will happen:
> > > >
> > > > 1. Follower requests offset 0 from the leader
> > > > 2. Assuming offset 0 is not available locally (to arrive at your
> > > scenario),
> > > > Leader returns OffsetMovedToTieredStorageException
> > > > 3. Follower fetches the earliest pending upload offset and receives
> 11
> > > > 4. Follower builds aux state from [0-10] and sets the fetch offset to
> > 11
> > > > (This step corresponds to step 3 in your email)
> > > >
> > > > At this stage, even if the leader has uploaded more data and the
> > > > last-tiered offset has changed (say to 15), it will not matter
> > > > because offset 11 should still be available on the leader and when
> the
> > > > follower requests data with fetch offset 11, the leader
> > > > will return with a valid partition data response which the follower
> can
> > > > consume and proceed further. Even if the offset 11 is not
> > > > available anymore, the follower will eventually be able to catch up
> > with
> > > > the leader by resetting its fetch offset until the offset
> > > > is available on the leader's local log. Once it catches up,
> replication
> > > on
> > > > the follower can proceed.
> > > >
> > > > Regards,
> > > > Abhijeet.
> > > >
> > > >
> > > >
> > > > On Tue, Jul 2, 2024 at 3:55 PM Divij Vaidya <divijvaidy...@gmail.com
> >
> > > > wrote:
> > > >
> > > > > Hi folks.
> > > > >
> > > > > I am late to the party but I have a question on the proposal.
> > > > >
> > > > > How are we preventing a situation such as the following:
> > > > >
> > > > > 1. Empty follower asks leader for 0
> > > > > 2. Leader compares 0 with last-tiered-offset, and responds with 11
> > > > > (where 10 is last-tiered-offset) and an OffsetMovedToTieredException
> > > > > 3. Follower builds aux state from [0-10] and sets the fetch offset
> to
> > > 11
> > > > > 4. But leader has already uploaded more data and now the new
> > > > > last-tiered-offset is 15
> > > > > 5. Go back to 2
> > > > >
> > > > > This could cause a cycle where the replica will be stuck trying to
> > > > > reconcile with the leader.
> > > > >
> > > > > --
> > > > > Divij Vaidya
> > > > >
> > > > >
> > > > >
> > > > > On Fri, Apr 26, 2024 at 7:28 AM Abhijeet Kumar <
> > > > abhijeet.cse....@gmail.com
> > > > > >
> > > > > wrote:
> > > > >
> > > > > > Thank you all for your comments. As all the comments in the
> thread
> > > are
> > > > > > addressed, I am starting a Vote thread for the KIP. Please have a
> > > look.
> > > > > >
> > > > > > Regards.
> > > > > > Abhijeet.
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Thu, Apr 25, 2024 at 6:08 PM Luke Chen <show...@gmail.com>
> > wrote:
> > > > > >
> > > > > > > Hi, Abhijeet,
> > > > > > >
> > > > > > > Thanks for the update.
> > > > > > >
> > > > > > > I have no more comments.
> > > > > > >
> > > > > > > Luke
> > > > > > >
> > > > > > > On Thu, Apr 25, 2024 at 4:21 AM Jun Rao
> <j...@confluent.io.invalid
> > >
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hi, Abhijeet,
> > > > > > > >
> > > > > > > > Thanks for the updated KIP. It looks good to me.
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > > On Mon, Apr 22, 2024 at 12:08 PM Abhijeet Kumar <
> > > > > > > > abhijeet.cse....@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Jun,
> > > > > > > > >
> > > > > > > > > Please find my comments inline.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Apr 18, 2024 at 3:26 AM Jun Rao
> > > <j...@confluent.io.invalid
> > > > >
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Abhijeet,
> > > > > > > > > >
> > > > > > > > > > Thanks for the reply.
> > > > > > > > > >
> > > > > > > > > > 1. I am wondering if we could achieve the same result by
> > just
> > > > > > > lowering
> > > > > > > > > > local.retention.ms and local.retention.bytes. This also
> > > allows
> > > > > the
> > > > > > > > newly
> > > > > > > > > > started follower to build up the local data before
> serving
> > > the
> > > > > > > consumer
> > > > > > > > > > traffic.
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > I am not sure I fully followed this. Do you mean we could
> > lower
> > > > the
> > > > > > > > > local.retention (by size and time)
> > > > > > > > > so that there is little data on the leader's local storage
> so
> > > > that
> > > > > > the
> > > > > > > > > follower can quickly catch up with the leader?
> > > > > > > > >
> > > > > > > > > In that case, we will need to set small local retention
> > across
> > > > > > brokers
> > > > > > > in
> > > > > > > > > the cluster. It will have the undesired
> > > > > > > > > effect where there will be increased remote log fetches for
> > > > serving
> > > > > > > > consume
> > > > > > > > > requests, and this can cause
> > > > > > > > > degradations. Also, this behaviour (of increased remote
> > > fetches)
> > > > > will
> > > > > > > > > happen on all brokers at all times, whereas in
> > > > > > > > > the KIP we are restricting the behavior only to the newly
> > > > > > bootstrapped
> > > > > > > > > brokers and only until the time it fully builds
> > > > > > > > > the local logs as per retention defined at the cluster
> level.
> > > > > > > > > (Deprioritization of the broker could help reduce the
> impact
> > > > > > > > >  even further)
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > 2. Have you updated the KIP?
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > > The KIP has been updated now.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > > On Tue, Apr 9, 2024 at 3:36 AM Satish Duggana <
> > > > > > > > satish.dugg...@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > +1 to Jun for adding the consumer fetching from a
> > follower
> > > > > > scenario
> > > > > > > > > > > also to the existing section that talked about the
> > drawback
> > > > > when
> > > > > > a
> > > > > > > > > > > node built with last-tiered-offset has become a leader.
> > As
> > > > > > Abhijeet
> > > > > > > > > > > mentioned, we plan to have a follow-up KIP that will
> > > address
> > > > > > these
> > > > > > > by
> > > > > > > > > > > having a deprioritization of these brokers. The
> > > > deprioritization
> > > > > > of
> > > > > > > > > > > those brokers can be removed once they catchup until
> the
> > > > local
> > > > > > log
> > > > > > > > > > > retention.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > > Satish.
> > > > > > > > > > >
> > > > > > > > > > > On Tue, 9 Apr 2024 at 14:08, Luke Chen <
> > show...@gmail.com>
> > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > Hi Abhijeet,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for the KIP to improve the tiered storage
> > feature!
> > > > > > > > > > > >
> > > > > > > > > > > > Questions:
> > > > > > > > > > > > 1. We could also get the "pending-upload-offset" and
> > > epoch
> > > > > via
> > > > > > > > remote
> > > > > > > > > > log
> > > > > > > > > > > > metadata, instead of adding a new API to fetch from
> the
> > > > > leader.
> > > > > > > > Could
> > > > > > > > > > you
> > > > > > > > > > > > explain why you choose the later approach, instead of
> > the
> > > > > > former?
> > > > > > > > > > > > 2.
> > > > > > > > > > > > > We plan to have a follow-up KIP that will address
> > both
> > > > the
> > > > > > > > > > > > deprioritization
> > > > > > > > > > > > of these brokers from leadership, as well as
> > > > > > > > > > > > for consumption (when fetching from followers is
> > > allowed).
> > > > > > > > > > > >
> > > > > > > > > > > > I agree with Jun that we might need to make it clear
> > all
> > > > > > possible
> > > > > > > > > > > drawbacks
> > > > > > > > > > > > that could have. So, could we add the drawbacks that
> > Jun
> > > > > > > mentioned
> > > > > > > > > > about
> > > > > > > > > > > > the performance issue when consumer fetch from
> > follower?
> > > > > > > > > > > >
> > > > > > > > > > > > 3. Could we add "Rejected Alternatives" section to
> the
> > > end
> > > > of
> > > > > > the
> > > > > > > > KIP
> > > > > > > > > > to
> > > > > > > > > > > > add some of them?
> > > > > > > > > > > > Like the "ListOffsetRequest" approach VS
> > > > > > > > > > "Earliest-Pending-Upload-Offset"
> > > > > > > > > > > > approach, or getting the
> > "Earliest-Pending-Upload-Offset"
> > > > > from
> > > > > > > > remote
> > > > > > > > > > log
> > > > > > > > > > > > metadata... etc.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks.
> > > > > > > > > > > > Luke
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Apr 9, 2024 at 2:25 PM Abhijeet Kumar <
> > > > > > > > > > > abhijeet.cse....@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi Christo,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Please find my comments inline.
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Apr 5, 2024 at 12:36 PM Christo Lolov <
> > > > > > > > > > christolo...@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hello Abhijeet and Jun,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I have been mulling this KIP over a bit more in
> > > recent
> > > > > > days!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > re: Jun
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I wasn't aware we apply 2.1 and 2.2 for reserving
> > new
> > > > > > > > timestamps
> > > > > > > > > -
> > > > > > > > > > in
> > > > > > > > > > > > > > retrospect it should have been fairly obvious. I
> > > would
> > > > > need
> > > > > > > to
> > > > > > > > go
> > > > > > > > > > an
> > > > > > > > > > > > > update
> > > > > > > > > > > > > > KIP-1005 myself then, thank you for giving the
> > useful
> > > > > > > > reference!
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 4. I think Abhijeet wants to rebuild state from
> > > > > > > > > > latest-tiered-offset
> > > > > > > > > > > and
> > > > > > > > > > > > > > fetch from latest-tiered-offset + 1 only for new
> > > > replicas
> > > > > > (or
> > > > > > > > > > > replicas
> > > > > > > > > > > > > > which experienced a disk failure) to decrease the
> > > time
> > > > a
> > > > > > > > > partition
> > > > > > > > > > > spends
> > > > > > > > > > > > > > in under-replicated state. In other words, a
> > follower
> > > > > which
> > > > > > > has
> > > > > > > > > > just
> > > > > > > > > > > > > fallen
> > > > > > > > > > > > > > out of ISR, but has local data will continue
> using
> > > > > today's
> > > > > > > > Tiered
> > > > > > > > > > > Storage
> > > > > > > > > > > > > > replication protocol (i.e. fetching from
> > > > > earliest-local). I
> > > > > > > > > further
> > > > > > > > > > > > > believe
> > > > > > > > > > > > > > he has taken this approach so that local state of
> > > > > replicas
> > > > > > > > which
> > > > > > > > > > have
> > > > > > > > > > > > > just
> > > > > > > > > > > > > > fallen out of ISR isn't forcefully wiped thus
> > leading
> > > > to
> > > > > > > > > situation
> > > > > > > > > > 1.
> > > > > > > > > > > > > > Abhijeet, have I understood (and summarised) what
> > you
> > > > are
> > > > > > > > > proposing
> > > > > > > > > > > > > > correctly?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Yes, your understanding is correct. We want to
> > limit
> > > > the
> > > > > > > > behavior
> > > > > > > > > > > changes
> > > > > > > > > > > > > only to new replicas.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > 5. I think in today's Tiered Storage we know the
> > > > leader's
> > > > > > > > > > > > > log-start-offset
> > > > > > > > > > > > > > from the FetchResponse and we can learn its
> > > > > > > > > local-log-start-offset
> > > > > > > > > > > from
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > ListOffsets by asking for earliest-local
> timestamp
> > > > (-4).
> > > > > > But
> > > > > > > > > > granted,
> > > > > > > > > > > > > this
> > > > > > > > > > > > > > ought to be added as an additional API call in
> the
> > > KIP.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > Yes, I clarified this in my reply to Jun. I will
> add
> > > this
> > > > > > > missing
> > > > > > > > > > > detail in
> > > > > > > > > > > > > the KIP.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > > re: Abhijeet
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 101. I am still a bit confused as to why you want
> > to
> > > > > > include
> > > > > > > a
> > > > > > > > > new
> > > > > > > > > > > offset
> > > > > > > > > > > > > > (i.e. pending-upload-offset) when you yourself
> > > mention
> > > > > that
> > > > > > > you
> > > > > > > > > > > could use
> > > > > > > > > > > > > > an already existing offset (i.e.
> > last-tiered-offset +
> > > > 1).
> > > > > > In
> > > > > > > > > > > essence, you
> > > > > > > > > > > > > > end your Motivation with "In this KIP, we will
> > focus
> > > > only
> > > > > > on
> > > > > > > > the
> > > > > > > > > > > follower
> > > > > > > > > > > > > > fetch protocol using the *last-tiered-offset*"
> and
> > > then
> > > > > in
> > > > > > > the
> > > > > > > > > > > following
> > > > > > > > > > > > > > sections you talk about pending-upload-offset. I
> > > > > understand
> > > > > > > > this
> > > > > > > > > > > might be
> > > > > > > > > > > > > > classified as an implementation detail, but if
> you
> > > > > > introduce
> > > > > > > a
> > > > > > > > > new
> > > > > > > > > > > offset
> > > > > > > > > > > > > > (i.e. pending-upload-offset) you have to make a
> > > change
> > > > to
> > > > > > the
> > > > > > > > > > > ListOffsets
> > > > > > > > > > > > > > API (i.e. introduce -6) and thus document it in
> > this
> > > > KIP
> > > > > as
> > > > > > > > such.
> > > > > > > > > > > > > However,
> > > > > > > > > > > > > > the last-tiered-offset ought to already be
> exposed
> > as
> > > > > part
> > > > > > of
> > > > > > > > > > > KIP-1005
> > > > > > > > > > > > > > (under implementation). Am I misunderstanding
> > > something
> > > > > > here?
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > I have tried to clarify this in my reply to Jun.
> > > > > > > > > > > > >
> > > > > > > > > > > > > > The follower needs to build the local data
> starting
> > > > from
> > > > > > the
> > > > > > > > > offset
> > > > > > > > > > > > > > Earliest-Pending-Upload-Offset. Hence it needs
> the
> > > > offset
> > > > > > and
> > > > > > > > the
> > > > > > > > > > > > > > corresponding leader-epoch.
> > > > > > > > > > > > > > There are two ways to do this:
> > > > > > > > > > > > > >    1. We add support in ListOffsetRequest to be
> > able
> > > to
> > > > > > fetch
> > > > > > > > > this
> > > > > > > > > > > offset
> > > > > > > > > > > > > > (and leader epoch) from the leader.
> > > > > > > > > > > > > >    2. Or, fetch the tiered-offset (which is
> already
> > > > > > > supported).
> > > > > > > > > > From
> > > > > > > > > > > this
> > > > > > > > > > > > > > offset, we can get the
> > > Earliest-Pending-Upload-Offset.
> > > > We
> > > > > > can
> > > > > > > > > just
> > > > > > > > > > > add 1
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > the tiered-offset.
> > > > > > > > > > > > > >       However, we still need the leader epoch for
> > > > offset,
> > > > > > > since
> > > > > > > > > > > there is
> > > > > > > > > > > > > > no guarantee that the leader epoch for
> > > > > > > > > > Earliest-Pending-Upload-Offset
> > > > > > > > > > > > > will
> > > > > > > > > > > > > > be the same as the
> > > > > > > > > > > > > >       leader epoch for tiered-offset. We may need
> > > > another
> > > > > > API
> > > > > > > > > call
> > > > > > > > > > > to the
> > > > > > > > > > > > > > leader for this.
> > > > > > > > > > > > > > I prefer the first approach. The only problem
> with
> > > the
> > > > > > first
> > > > > > > > > > > approach is
> > > > > > > > > > > > > > that it introduces one more offset. The second
> > > approach
> > > > > > > avoids
> > > > > > > > > this
> > > > > > > > > > > > > problem
> > > > > > > > > > > > > > but is a little complicated.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > Christo
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, 4 Apr 2024 at 19:37, Jun Rao
> > > > > > > <j...@confluent.io.invalid
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi, Abhijeet,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for the KIP. Left a few comments.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1. "A drawback of using the last-tiered-offset
> is
> > > > that
> > > > > > this
> > > > > > > > new
> > > > > > > > > > > > > follower
> > > > > > > > > > > > > > > would possess only a limited number of locally
> > > stored
> > > > > > > > segments.
> > > > > > > > > > > Should
> > > > > > > > > > > > > it
> > > > > > > > > > > > > > > ascend to the role of leader, there is a risk
> of
> > > > > needing
> > > > > > to
> > > > > > > > > fetch
> > > > > > > > > > > these
> > > > > > > > > > > > > > > segments from the remote storage, potentially
> > > > impacting
> > > > > > > > broker
> > > > > > > > > > > > > > > performance."
> > > > > > > > > > > > > > > Since we support consumers fetching from
> > followers,
> > > > > this
> > > > > > > is a
> > > > > > > > > > > potential
> > > > > > > > > > > > > > > issue on the follower side too. In theory, it's
> > > > > possible
> > > > > > > for
> > > > > > > > a
> > > > > > > > > > > segment
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > be tiered immediately after rolling. In that
> > case,
> > > > > there
> > > > > > > > could
> > > > > > > > > be
> > > > > > > > > > > very
> > > > > > > > > > > > > > > little data after last-tiered-offset. It would
> be
> > > > > useful
> > > > > > to
> > > > > > > > > think
> > > > > > > > > > > > > through
> > > > > > > > > > > > > > > how to address this issue.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2. ListOffsetsRequest:
> > > > > > > > > > > > > > > 2.1 Typically, we need to bump up the version
> of
> > > the
> > > > > > > request
> > > > > > > > if
> > > > > > > > > > we
> > > > > > > > > > > add
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > > new value for timestamp. See
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/kafka/pull/10760/files#diff-fac7080d67da905a80126d58fc1745c9a1409de7ef7d093c2ac66a888b134633
> > > > > > > > > > > > > > > .
> > > > > > > > > > > > > > > 2.2 Since this changes the inter broker request
> > > > > protocol,
> > > > > > > it
> > > > > > > > > > would
> > > > > > > > > > > be
> > > > > > > > > > > > > > > useful to have a section on upgrade (e.g. new
> > > > > > > > > > > IBP/metadata.version).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 3. "Instead of fetching
> > > > Earliest-Pending-Upload-Offset,
> > > > > > it
> > > > > > > > > could
> > > > > > > > > > > fetch
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > last-tiered-offset from the leader, and make a
> > > > separate
> > > > > > > > leader
> > > > > > > > > > > call to
> > > > > > > > > > > > > > > fetch leader epoch for the following offset."
> > > > > > > > > > > > > > > Why do we need to make a separate call for the
> > > leader
> > > > > > > epoch?
> > > > > > > > > > > > > > > ListOffsetsResponse include both the offset and
> > the
> > > > > > > > > corresponding
> > > > > > > > > > > > > epoch.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 4. "Check if the follower replica is empty and
> if
> > > the
> > > > > > > feature
> > > > > > > > > to
> > > > > > > > > > > use
> > > > > > > > > > > > > > > last-tiered-offset is enabled."
> > > > > > > > > > > > > > > Why do we need to check if the follower replica
> > is
> > > > > empty?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 5. It can be confirmed by checking if the
> > leader's
> > > > > > > > > > > Log-Start-Offset is
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > same as the Leader's Local-Log-Start-Offset.
> > > > > > > > > > > > > > > How does the follower know
> > Local-Log-Start-Offset?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Sat, Mar 30, 2024 at 5:51 AM Abhijeet Kumar
> <
> > > > > > > > > > > > > > abhijeet.cse....@gmail.com
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi Christo,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks for reviewing the KIP.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > The follower needs the
> > > > earliest-pending-upload-offset
> > > > > > > (and
> > > > > > > > > the
> > > > > > > > > > > > > > > > corresponding leader epoch) from the leader.
> > > > > > > > > > > > > > > > This is the first offset the follower will
> have
> > > > > > locally.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > Abhijeet.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, Mar 29, 2024 at 1:14 PM Christo
> Lolov <
> > > > > > > > > > > > > christolo...@gmail.com>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Heya!
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > First of all, thank you very much for the
> > > > proposal,
> > > > > > you
> > > > > > > > > have
> > > > > > > > > > > > > > explained
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > problem you want solved very well - I
> think a
> > > > > faster
> > > > > > > > > > bootstrap
> > > > > > > > > > > of
> > > > > > > > > > > > > an
> > > > > > > > > > > > > > > > empty
> > > > > > > > > > > > > > > > > replica is definitely an improvement!
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > For my understanding, which concrete offset
> > do
> > > > you
> > > > > > want
> > > > > > > > the
> > > > > > > > > > > leader
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > give
> > > > > > > > > > > > > > > > > back to a follower -
> > > > earliest-pending-upload-offset
> > > > > > or
> > > > > > > > the
> > > > > > > > > > > > > > > > > latest-tiered-offset? If it is the second,
> > > then I
> > > > > > > believe
> > > > > > > > > > > KIP-1005
> > > > > > > > > > > > > > > ought
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > already be exposing that offset as part of
> > the
> > > > > > > > ListOffsets
> > > > > > > > > > > API, no?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Best,
> > > > > > > > > > > > > > > > > Christo
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Wed, 27 Mar 2024 at 18:23, Abhijeet
> Kumar
> > <
> > > > > > > > > > > > > > > abhijeet.cse....@gmail.com
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hi All,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I have created KIP-1023 to introduce
> > follower
> > > > > fetch
> > > > > > > > from
> > > > > > > > > > > tiered
> > > > > > > > > > > > > > > offset.
> > > > > > > > > > > > > > > > > > This feature will be helpful in
> > significantly
> > > > > > > reducing
> > > > > > > > > > Kafka
> > > > > > > > > > > > > > > > > > rebalance/rebuild times when the cluster
> is
> > > > > enabled
> > > > > > > > with
> > > > > > > > > > > tiered
> > > > > > > > > > > > > > > > storage.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-1023%3A+Follower+fetch+from+tiered+offset
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Feedback and suggestions are welcome.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > > > Abhijeet.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
