[jira] [Commented] (KAFKA-20114) Fix race between requestInFlight and backoffDeadlineMs in RPCProducerIdManager causing premature retries

Sean Quah (Jira) Tue, 03 Feb 2026 10:24:04 -0800


    [ 
https://issues.apache.org/jira/browse/KAFKA-20114?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18056279#comment-18056279
 ]


Sean Quah commented on KAFKA-20114:
-----------------------------------

Thanks, looking forward to your PR!

> Fix race between requestInFlight and backoffDeadlineMs in 
> RPCProducerIdManager causing premature retries
> --------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-20114
>                 URL: https://issues.apache.org/jira/browse/KAFKA-20114
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: sanghyeok An
>            Assignee: sanghyeok An
>            Priority: Minor
>              Labels: producer, transaction
>         Attachments: image-2026-02-03-08-53-05-655.png
>
>
> RPCProducerIdManager uses two independent atomics, requestInFlight and 
> backoffDeadlineMs. Because they are not updated together, a race remains that 
> can cause premature retries: maybeRequestNextBlock can read a stale 
> backoffDeadlineMs just before a concurrent in-flight failure applies a new 
> backoff and clears requestInFlight.
> The problematic interleaving is:
>  * maybeRequestNextBlock reads backoffDeadlineMs before the failure handler 
> updates it, and
>  * the failure handler clears requestInFlight before maybeRequestNextBlock 
> attempts compareAndSet.
> In that case maybeRequestNextBlock successfully sets requestInFlight and calls 
> sendRequest immediately, effectively ignoring the newly applied retry backoff.
>  
> !image-2026-02-03-08-53-05-655.png|width=1040,height=538!
>
> *Previous discussion in another PR*
> [https://github.com/apache/kafka/pull/21279#issuecomment-3836196135]
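
The interleaving described in the issue can be replayed deterministically in a single thread. The sketch below is only an illustration, not the actual RPCProducerIdManager code: the class BackoffGateSketch, the NO_RETRY sentinel, the sendCount counter, and the methods readDeadline, tryAcquireAndSend, onFailure and replay are all hypothetical names. It assumes the failure handler publishes the new deadline before clearing requestInFlight, as the description suggests. The recheckAfterCas flag shows one possible mitigation (re-reading the deadline after winning the compareAndSet), which is not necessarily what the upcoming PR will do.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for the gating logic; not the real RPCProducerIdManager.
final class BackoffGateSketch {
    static final long NO_RETRY = -1L; // assumed sentinel meaning "no backoff set"

    final AtomicBoolean requestInFlight = new AtomicBoolean(false);
    final AtomicLong backoffDeadlineMs = new AtomicLong(NO_RETRY);
    int sendCount = 0; // counts simulated sendRequest() calls

    // Caller, step 1: snapshot the current backoff deadline.
    long readDeadline() {
        return backoffDeadlineMs.get();
    }

    // Caller, step 2: act on the snapshot taken in step 1.
    boolean tryAcquireAndSend(long observedDeadline, long nowMs, boolean recheckAfterCas) {
        if (observedDeadline != NO_RETRY && nowMs < observedDeadline) {
            return false; // still backing off according to the snapshot
        }
        if (!requestInFlight.compareAndSet(false, true)) {
            return false; // another request is already in flight
        }
        if (recheckAfterCas) {
            // Possible mitigation: the snapshot may be stale, so read again
            // now that the in-flight flag is owned by this caller.
            long current = backoffDeadlineMs.get();
            if (current != NO_RETRY && nowMs < current) {
                requestInFlight.set(false); // release the flag and honor the backoff
                return false;
            }
        }
        sendCount++; // stands in for sendRequest()
        return true;
    }

    // Failure handler: apply the new backoff, then clear the in-flight flag.
    void onFailure(long nowMs, long backoffMs) {
        backoffDeadlineMs.set(nowMs + backoffMs);
        requestInFlight.set(false);
    }

    // Replays the interleaving from the issue in a fixed order and
    // returns how many requests were sent in total.
    static int replay(boolean recheckAfterCas) {
        BackoffGateSketch gate = new BackoffGateSketch();
        long nowMs = 1_000L;
        // First request goes out and stays in flight.
        gate.tryAcquireAndSend(gate.readDeadline(), nowMs, recheckAfterCas);
        long stale = gate.readDeadline(); // caller reads NO_RETRY
        gate.onFailure(nowMs, 50L);       // failure sets deadline to 1050, clears flag
        // Caller resumes with the stale snapshot while the clock is still at 1000.
        gate.tryAcquireAndSend(stale, nowMs, recheckAfterCas);
        return gate.sendCount;
    }

    public static void main(String[] args) {
        System.out.println("sends without recheck: " + replay(false));
        System.out.println("sends with recheck:    " + replay(true));
    }
}
```

Without the recheck the replay sends two requests, the second one 50 ms before the backoff deadline. With the recheck the second attempt is dropped and only the first request is counted.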



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
