Hi José,

Wondering if there is perhaps a typo. Under *KRaft Handling* the wording
mentions parking the request if the HWM in the request is >= the leader's
> If the response is empty and the remote replica's HWM is greater than or
> equal to the leader's HWM, the FETCH request is parked.

Under *Replica Manager* the wording changes to <=

> the replica manager will also park fetch requests if the remote HWM is
> less than or equal to the replica's HWM.

Under *Compatibility*, it then says we won't park if the remote HWM is >= to
the leader's

> remote replica's HWM will always be greater that its HWM so the leader
> will never park a FETCH request because of the HWM. This is the behavior
> prior to this KIP.

Thanks!
Alyssa

On Mon, Apr 28, 2025 at 5:43 PM Colin McCabe <cmcc...@apache.org> wrote:

> Hi José,
>
> Thank you for this much-needed improvement to metadata propagation.
>
> Maybe I missed something, but even after reading the discussion below, I
> still don't understand the rationale for separating the "RPC version too
> old" and "high watermark not known" cases. Is the idea that separating
> these cases will make debugging easier? Like if we see a -1 on the wire, we
> know that it is an unknown HWM situation, and not an older RPC version? Or
> is there some other reason for separating the two?
>
> thanks,
> Colin
>
>
> On Mon, Apr 28, 2025, at 14:27, José Armando García Sancio wrote:
> > Hi David,
> >
> > Thanks for the feedback.
> >
> > On Mon, Apr 28, 2025 at 2:51 PM David Arthur
> > <david.art...@confluent.io.invalid> wrote:
> >> DA1. It might be more clear if we call the field something like
> >> "LastFetchedHighWaterMark" (similar to "LastFetchedEpoch").
> >> "HighWaterMark" is already very prevalent in ReplicaManager, so it
> >> might be nice to have a different field name :)
> >
> > I would like to keep the name concise. I think that the FETCH request
> > has a LastFetchedEpoch because Kafka has a lot of epochs with
> > different scope and lifecycle. E.g. producer epoch, partition epoch,
> > partition leader epoch and broker epoch. To my knowledge, Kafka only
> > has one high-watermark.
> >
> >> DA2.
> >> Why use a default of max int64 instead of -1? Will these two values
> >> have any practical difference? It seems like both values will have the
> >> effect of bypassing the new logic.
> >
> > Jun asked a similar question and I have updated the KIP to answer
> > this. With respect to the FETCH request, I group the values of the
> > HighWatermark fields into 3 categories:
> > 1. Unknown high-watermark. KRaft models this using -1. The replica
> > manager models this using the log start offset.
> > 2. Known high-watermark. The field would have the range of 0 to
> > maximum value of int64, inclusive.
> > 3. The sending replica doesn't support or implement this KIP.
> >
> > The default value in the schema is solving bullet point 3. In this
> > case the HighWatermark field will not be included in the FETCH
> > request. When the HighWatermark field is not specified, Kafka should
> > behave as it does today. Today Kafka doesn't evaluate the HWM when
> > deciding to park FETCH requests. The logic - or predicate - for
> > parking requests can be "local HWM <= remote HWM". This is always true
> > if the remote HWM is the maximum value of int64 and will behave
> > similar to how Kafka behaves today. If we use -1 for this case then
> > the predicate becomes "remote HWM == -1 OR local HWM <= remote HWM."
> >
> >> DA3. Do we always want to return immediately if the leader sees the
> >> follower lagging behind the HWM? Would there be any benefit to allow the
> >> leader to wait a short time for data to accumulate? Something like an
> >> order of magnitude less time than MaxWaitMs.
> >
> > That's fair. I would consider this an implementation detail. The
> > replica manager implementation takes a lot of things into account when
> > deciding whether to complete or park the FETCH request. I'll update
> > the design to state that the receiving replica could complete the
> > FETCH request if the "remote HWM < local HWM."
> >
> >> DA4.
> >> The motivation section seems focused on KRaft (500ms MaxWaitMs). If I
> >> understand correctly, this enhancement will apply to all FETCH, not
> >> just in KRaft.
> >
> > Yes. It also applies to the fetcher threads for regular topics in the
> > brokers. I didn't add them in the motivation section because the
> > motivation is not as strong. As Jun pointed out, fetch-from-follower
> > could benefit from this feature but I don't have any strong evidence
> > for it. I think that we haven't seen increased latency with FFF
> > because the fetcher thread batches all of the topic partition into one
> > FETCH request so the HWM would be replicated for one partition because
> > of other partitions in the same FETCH request.
> >
> > Thanks,
> > --
> > -José
>
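
For reference, the two parking predicates quoted in the DA2 reply ("local HWM <= remote HWM" with a max-int64 default, versus "remote HWM == -1 OR local HWM <= remote HWM" with a -1 default) can be written out as a minimal sketch. The class, method, and parameter names below are hypothetical and not taken from the Kafka code base, and the `responseEmpty` condition is assumed from the KIP wording quoted at the top of this thread.

```java
// Illustrative sketch only: names are hypothetical, not Kafka's actual API.
final class FetchParkingSketch {

    // Schema default as described in the thread: a sender that does not
    // implement the KIP omits the field, so the receiver sees max int64.
    static final long DEFAULT_REMOTE_HWM = Long.MAX_VALUE;

    // Predicate with the max-int64 default: park only when there is nothing
    // to return and the remote replica already knows the local HWM.
    static boolean shouldPark(boolean responseEmpty, long localHwm, long remoteHwm) {
        return responseEmpty && localHwm <= remoteHwm;
    }

    // Alternative predicate if -1 were used as the schema default instead:
    // -1 needs a special case to keep the pre-KIP behavior.
    static boolean shouldParkWithMinusOneDefault(boolean responseEmpty, long localHwm, long remoteHwm) {
        return responseEmpty && (remoteHwm == -1L || localHwm <= remoteHwm);
    }

    public static void main(String[] args) {
        // Sender without the KIP: field absent, default applies, request parks.
        System.out.println(shouldPark(true, 100L, DEFAULT_REMOTE_HWM));
        // Sender is behind the local HWM: request completes immediately.
        System.out.println(shouldPark(true, 100L, 90L));
        // Sender reports an unknown HWM (-1 in KRaft): request completes.
        System.out.println(shouldPark(true, 100L, -1L));
    }
}
```

With the max-int64 default, a remote HWM of -1 keeps its "unknown" meaning and the request completes immediately. With a -1 default, the same wire value parks the request, so "unknown" and "field not sent" can no longer be told apart.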