Re: [DISCUSS] KIP-1166: Improve high-watermark replication

Alyssa Huang Tue, 29 Apr 2025 10:51:50 -0700

Hi José,

Wondering if there is perhaps a typo - under *KRaft Handling *the wording
mentions parking the request if the HWM in the request is >= to the leader's


> If the response is empty and the remote replica's HWM is greater than or
> equal to the leader's HWM, the FETCH request is parked.

Under *Replica Manager* the wording changes to <=

>  the replica manager will also park fetch requests if the remote HWM is
> less than or equal to the replica's HWM.

Under *Compatibility*, it then says we won't park if the remote HWM is >=
to the leader's

>  remote replica's HWM will always be greater that its HWM so the leader
> will never park a FETCH request because of the HWM. This is the behavior
> prior to this KIP.


Thanks!
Alyssa

On Mon, Apr 28, 2025 at 5:43 PM Colin McCabe <[email protected]> wrote:

> Hi José,
>
> Thank you for this much-needed improvement to metadata propagation.
>
> Maybe I missed something, but even after reading the discussion below, I
> still don't understand the rationale for separating the "RPC version too
> old" and "high watermark not known" cases. Is the idea that separating
> these cases will make debugging easier? Like if we see a -1 on the wire, we
> know that it is an unknown HWM situation, and not an older RPC version? Or
> is there some other reason for separating the two?
>
> thanks,
> Colin
>
>
> On Mon, Apr 28, 2025, at 14:27, José Armando García Sancio wrote:
> > Hi David,
> >
> > Thanks for the feedback.
> >
> > On Mon, Apr 28, 2025 at 2:51 PM David Arthur
> > <[email protected]> wrote:
> >> DA1. It might be more clear if we call the field something like
> >> "LastFetchedHighWaterMark" (similar to "LastFetchedEpoch").
> "HighWaterMark"
> >> is already very prevalent in ReplicaManager, so it might be nice to
> have a
> >> different field name :)
> >
> > I would like to keep the name concise. I think that the FETCH request
> > has a LastFetchedEpoch because Kafka has a lot of epochs with
> > different scope and lifecycle. E.g. producer epoch, partition epoch,
> > partition leader epoch and broker epoch. To my knowledge, Kafka only
> > has one high-watermark.
> >
> >> DA2. Why use a default of max int64 instead of -1? Will these two values
> >> have any practical difference? It seems like both values will have the
> >> effect of bypassing the new logic.
> >
> > Jun asked a similar question and I have updated the KIP to answer
> > this. With respect to the FETCH request, I group the values of the
> > HighWatermark fields into 3 categories:
> > 1. Unknown high-watermark. KRaft models this using -1. The replica
> > manager models this using the log start offset.
> > 2. Known high-watermark. The field would have the range of 0 to
> > maximum value of int64, inclusive.
> > 3. The sending replica doesn't support or implement this KIP.
> >
> > The default value in the schema is solving bullet point 3. In this
> > case the HighWatermark field will not be included in the FETCH
> > request. When the HighWatermark field is not specified, Kafka should
> > behave as it does today. Today Kafka doesn't evaluate the HWM when
> > deciding to park FETCH requests. The logic - or predicate - for
> > parking requests can be "local HWM <= remote HWM". This is always true
> > if the remote HWM is the maximum value of int64 and will behave
> > similar to how Kafka behaves today. If we use -1 for this case then
> > the predicate becomes "remote HWM == -1 OR local HWM <= remote HWM."
> >
> >> DA3. Do we always want to return immediately if the leader sees the
> >> follower lagging behind the HWM? Would there be any benefit to allow the
> >> leader to wait a short time for data to accumulate? Something like an
> order
> >> of magnitude less time than MaxWaitMs.
> >
> > That's fair. I would consider this an implementation detail. The
> > replica manager implementation takes a lot of things into account when
> > deciding whether to complete or park the FETCH request. I'll update
> > the design to state that the receiving replica could complete the
> > FETCH request if the "remote HWM < local HWM."
> >
> >> DA4. The motivation section seems focused on KRaft (500ms MaxWaitMs).
> If I
> >> understand correctly, this enhancement will apply to all FETCH, not
> just in
> >> KRaft.
> >
> > Yes. It also applies to the fetcher threads for regular topics in the
> > brokers. I didn't add them in the motivation section because the
> > motivation is not as strong. As Jun pointed out, fetch-from-follower
> > could benefit from this feature but I don't have any strong evidence
> > for it. I think that we haven't seen increased latency with FFF
> > because the fetcher thread batches all of the topic partition into one
> > FETCH request so the HWM would be replicated for one partition because
> > of other partitions in the same FETCH request.
> >
> > Thanks,
> > --
> > -José
>

Re: [DISCUSS] KIP-1166: Improve high-watermark replication

Reply via email to