[jira] [Commented] (KAFKA-19354) KRaft observer unable to recover after re-bootstrapping to follower

Justin Chen (Jira) Wed, 04 Jun 2025 09:27:04 -0700


    [ 
https://issues.apache.org/jira/browse/KAFKA-19354?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17956147#comment-17956147
 ]


Justin Chen commented on KAFKA-19354:
-------------------------------------

[~jsancio] Re-ran the test with a network delay of 1500ms instead of 2500ms, 
and was able to reproduce the observer (kafka-0) getting stuck fetching from 
a follower node (7003) instead of the leader (7002).

Logs: https://gist.github.com/justin-chen/717f8fb1c066a72d3ef443f945583a9c


> KRaft observer unable to recover after re-bootstrapping to follower
> -------------------------------------------------------------------
>
>                 Key: KAFKA-19354
>                 URL: https://issues.apache.org/jira/browse/KAFKA-19354
>             Project: Kafka
>          Issue Type: Bug
>          Components: kraft
>    Affects Versions: 4.0.0
>            Reporter: Justin Chen
>            Assignee: Alyssa Huang
>            Priority: Major
>
> [Original dev mail 
> thread|https://lists.apache.org/thread/ws3390khsxhdg2b8cnv2mzv8slz5xq7q]
> If an observer's FETCH request to the quorum leader experiences a 
> failure/timeout, it is possible that when it re-bootstraps, it will connect 
> to a follower node (random selection). Subsequently, the observer node will 
> continually send FETCH requests to that follower, and in return receive a 
> response with a "partitionError" errorCode=6 (NOT_LEADER_OR_FOLLOWER), which 
> does not trigger a re-bootstrap.
> Thus, the observer will be stuck sending FETCH requests to the follower 
> instead of the leader, halting metadata replication and causing it to fall 
> out of sync.
> To recover from this state, a re-bootstrap has to be forced by restarting 
> the affected observer or the follower, repeating until the observer connects 
> to the actual leader.
> *Steps to reproduce:*
> 1. Spin up a Kafka cluster with 3 or 5 controllers (ideally 5, to increase 
> the likelihood of bootstrapping to a follower instead of the leader).
> 2. Enable a network delay on a particular observer broker (e.g. `tc qdisc add 
> dev eth0 root netem delay 2500ms`). I picked 2500ms since the default timeout 
> is 2s for 
> `controller.quorum.fetch.timeout.ms`/`controller.quorum.request.timeout.ms`. 
> After a few seconds, disable the network delay (e.g. `tc qdisc del dev eth0 
> root netem`).
> 3. The observer node will re-bootstrap, potentially to a follower instead of 
> the leader. If so, the observer will continuously send fetch requests to the 
> follower node, receive `NOT_LEADER_OR_FOLLOWER` in response, and will no 
> longer replicate metadata.
> *Debug logs demonstrating this scenario:*
> - https://gist.github.com/justin-chen/1f3eee79d9a5066a467818a0b1bc006f
> - kraftcontroller-3 (leader), kraftcontroller-4 (follower), kafka-0 (observer)
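
For anyone skimming the thread, the stuck state described in the issue can be illustrated with a toy model. This is only a sketch under stated assumptions, not Kafka's KRaft client code: the classes `ToyQuorum` and `ToyObserver`, the `rebootstrap_on_not_leader` flag, and the third voter id 7001 are all hypothetical names made up for illustration. Only the error code 6 (NOT_LEADER_OR_FOLLOWER), the random choice of bootstrap target, and the node labels 7002 (leader) and 7003 (follower) come from the report. The flag is there to contrast the two behaviours, not to suggest how the actual fix should look.

```python
import random

NONE = 0
NOT_LEADER_OR_FOLLOWER = 6  # errorCode quoted in the issue description


class ToyQuorum:
    """Hypothetical stand-in for the controller quorum."""

    def __init__(self, leader_id, voter_ids):
        self.leader_id = leader_id
        self.voter_ids = list(voter_ids)

    def fetch(self, node_id):
        # Only the leader serves the fetch; any other voter rejects it.
        return NONE if node_id == self.leader_id else NOT_LEADER_OR_FOLLOWER


class ToyObserver:
    """Hypothetical observer that fetches metadata from one target node."""

    def __init__(self, quorum, rebootstrap_on_not_leader, rng):
        self.quorum = quorum
        self.rebootstrap_on_not_leader = rebootstrap_on_not_leader
        self.rng = rng
        self.fetch_target = None
        self.replicated = 0  # number of successful fetches

    def bootstrap(self):
        # Random selection among the voters, as described in the report.
        self.fetch_target = self.rng.choice(self.quorum.voter_ids)

    def poll(self):
        error = self.quorum.fetch(self.fetch_target)
        if error == NONE:
            self.replicated += 1
        elif self.rebootstrap_on_not_leader:
            # Assumed alternative behaviour: pick a new target on this error.
            self.bootstrap()
        # Otherwise the target is unchanged: this is the reported stuck state.
        return error


def simulate(rebootstrap_on_not_leader, initial_target, polls=200, seed=0):
    # 7002 = leader and 7003 = follower as in the comment; 7001 is assumed.
    quorum = ToyQuorum(leader_id=7002, voter_ids=[7001, 7002, 7003])
    observer = ToyObserver(quorum, rebootstrap_on_not_leader, random.Random(seed))
    observer.fetch_target = initial_target  # pretend the re-bootstrap landed here
    for _ in range(polls):
        observer.poll()
    return observer


if __name__ == "__main__":
    stuck = simulate(rebootstrap_on_not_leader=False, initial_target=7003)
    print("no re-bootstrap:", stuck.fetch_target, stuck.replicated)
    recovered = simulate(rebootstrap_on_not_leader=True, initial_target=7003)
    print("with re-bootstrap:", recovered.fetch_target, recovered.replicated)
```

With the flag off, the toy observer never leaves 7003 and replicates nothing, which mirrors the behaviour in the logs. With the flag on, it keeps re-picking a random voter until it lands on 7002 and then stays there.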



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
