Re: [DISCUSS] KIP-1236: Adjust quorum-related config lower bounds

TaiJu Wu Wed, 12 Nov 2025 19:00:41 -0800

Hi Maros,

Thanks for your reply. The following is my understanding.


The reason is that controller.quorum.fetch.timeout.ms is used here
<https://github.com/apache/kafka/blob/2c9180e8d181798bf55e37801cb16d79c36f9ff4/raft/src/main/java/org/apache/kafka/raft/KafkaRaftClient.java#L553>
(1),
and the leader sets the beginQuorumRequestTimeout here
<https://github.com/apache/kafka/blob/6b187e9ff9374711a10452cc3aaa903837d937ba/raft/src/main/java/org/apache/kafka/raft/LeaderState.java#L156>
 (2).
(2) is derived from (1), and the Raft max fetch wait time
(RAFT_MAX_FETCH_WAIT_MS) is 500ms, so we need to set
controller.quorum.fetch.timeout.ms to at least 1000.
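
To make the arithmetic behind the 1000 figure concrete, here is a minimal
sketch. It assumes the leader's BeginQuorum timeout is half of
controller.quorum.fetch.timeout.ms, which is what the 500/1000 figures
suggest; the class and method names are hypothetical and not taken from
the Kafka code base.

```java
// Hypothetical sketch; names are illustrative and not part of the Kafka API.
final class QuorumTimeoutSketch {
    // Assumption: the BeginQuorum timeout is half the fetch timeout.
    static int beginQuorumTimeoutMs(int fetchTimeoutMs) {
        return fetchTimeoutMs / 2;
    }

    // Smallest fetch timeout whose derived BeginQuorum timeout is not
    // shorter than the max fetch wait.
    static int minFetchTimeoutMs(int maxFetchWaitMs) {
        return 2 * maxFetchWaitMs;
    }

    // True when the derived BeginQuorum timeout covers the max fetch wait.
    static boolean isValid(int fetchTimeoutMs, int maxFetchWaitMs) {
        return beginQuorumTimeoutMs(fetchTimeoutMs) >= maxFetchWaitMs;
    }

    public static void main(String[] args) {
        int maxFetchWaitMs = 500; // figure used in this thread
        System.out.println(minFetchTimeoutMs(maxFetchWaitMs)); // 1000
        System.out.println(isValid(800, maxFetchWaitMs));      // false
        System.out.println(isValid(1000, maxFetchWaitMs));     // true
    }
}
```

Under that assumption, 800 gives a BeginQuorum timeout of 400ms, shorter
than the 500ms fetch wait, while 1000 gives exactly 500ms.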

Another point is that controller.quorum.fetch.timeout.ms must leave enough
time for a round trip even when the network latency equals the Raft max
fetch wait time.

Extending your example, and assuming the network latency is always 100ms:
T=0ms:     Follower sends fetch request and starts 500ms *fetchTimer*
T=500ms: Leader has not received a fetch request (RAFT_MAX_FETCH_WAIT_MS,
e.g. 500), so two things happen:
                 1) The leader sends a BeginQuorum request to the follower
                 2) The follower sends a fetch request to the leader and
starts its *fetchTimer*
T=600ms: four cases
                 1) the follower receives the BeginQuorum request and
responds to it (100ms network latency) -> leader still works
                 2) the follower does not receive the BeginQuorum request ->
leader MAY not be working
                 3) the leader receives the fetch request and responds to it
(100ms network latency) -> follower still works
                 4) the leader does not receive the fetch request ->
follower MAY not be working

T=800ms: Follower's fetch timer expires.

Then the follower transitions to the *Prospective state* and triggers an
election... (wasted resources even though the leader is working fine).

From the above flow, the leader and the follower get two chances to check
each other's state, so the algorithm works
(one is BeginQuorum, the other is fetch, but they start at different points).
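
The margin left on the follower's fetch timer can be estimated with a
small model. This is only a sketch under simplifying assumptions: constant
one-way latency, a leader that holds the fetch for the full max fetch wait,
and a timer that starts when the fetch is sent. The names are hypothetical
and do not come from KafkaRaftClient.

```java
// Hypothetical sketch; a simplified model, not the Kafka implementation.
final class FetchTimerSketch {
    // Time at which the fetch response reaches the follower:
    // request latency + leader wait + response latency.
    static int responseArrivalMs(int latencyMs, int maxFetchWaitMs) {
        return latencyMs + maxFetchWaitMs + latencyMs;
    }

    // Time left on the follower's fetch timer when the response arrives.
    // A small or negative value means little or no room for a retry.
    static int slackMs(int fetchTimeoutMs, int latencyMs, int maxFetchWaitMs) {
        return fetchTimeoutMs - responseArrivalMs(latencyMs, maxFetchWaitMs);
    }

    public static void main(String[] args) {
        int latencyMs = 100;      // figure used in this thread
        int maxFetchWaitMs = 500; // figure used in this thread
        System.out.println(responseArrivalMs(latencyMs, maxFetchWaitMs)); // 700
        System.out.println(slackMs(800, latencyMs, maxFetchWaitMs));      // 100
        System.out.println(slackMs(1000, latencyMs, maxFetchWaitMs));     // 300
    }
}
```

With an 800ms timeout only 100ms remains after a full fetch round trip,
less than one more round trip would need; 1000ms leaves 300ms in this model.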

Best,
TaiJu Wu


Maroš Orsák <[email protected]> wrote on Wed, Nov 12, 2025 at 10:54 PM:

> Hi,
>
> So basically, one example of how we could violate such an invariant
> is when, e.g., `controller.quorum.fetch.timeout.ms = 800ms`:
>
> T=0ms:     Follower sends fetch request and starts 800ms *fetchTimer*
> T=100ms: Leader receives request (100ms network latency)
> ...
> T=600ms: Leader waits 500ms for new data (MAX_FETCH_WAIT_MS)
> T=600ms: Leader sends response
> T=700ms: Response arrives at follower (100ms network latency)
> ...
> T=800ms: Follower *fetchTimer* expires.
>
> Then the follower transitions to the *Prospective state* and triggers an
> election... (wasted resources even though the leader is working fine).
>
> Is my understanding correct? If so, maybe it would be great to add such
> an example to the KIP?
>
> Maros.
>
