[ 
https://issues.apache.org/jira/browse/KAFKA-20635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

David Jacot resolved KAFKA-20635.
---------------------------------
    Fix Version/s: 4.3.1
                   4.4.0
       Resolution: Fixed

> Spurious "Writing records..." failed errors in the group coordinator after 
> partition leadership change
> ------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-20635
>                 URL: https://issues.apache.org/jira/browse/KAFKA-20635
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 4.0.0, 4.1.0, 4.2.0, 4.3.0
>            Reporter: David Jacot
>            Assignee: David Jacot
>            Priority: Minor
>             Fix For: 4.3.1, 4.4.0
>
>
> During routine __consumer_offsets partition leadership changes, the group 
> coordinator spams ERROR-level logs for every in-flight write at the moment of 
> transition:
> {noformat}
> [GroupCoordinator id=N] Writing records to __consumer_offsets-N failed due 
> to: For requests intended only for the leader, this error indicates that the 
> broker is not the current leader ...
> [GroupCoordinator id=N] Execution of FlushBatch failed due to For requests 
> intended only for the leader, this error indicates that the broker is not the 
> current leader ...
> {noformat}
> These appear on the group coordinator that lost leadership and continue until 
> the queue of in-flight batches has drained. The behavior is correct: 
> NotLeaderOrFollowerException propagates through failCurrentBatch to the 
> deferred events and is mapped to NOT_COORDINATOR for clients via 
> CoordinatorOperationExceptionHelper, so clients retry against the new 
> coordinator. This is purely a logging-noise issue.
> This has the same root cause as KAFKA-20634: the partition transitions to 
> follower synchronously, while the coordinator unload is asynchronous. In that 
> window, partitionWriter.append calls replicaManager.appendRecordsToLeader, 
> which legitimately rejects writes for a partition no longer led by this 
> broker. The exception is expected, but it is logged at ERROR by the catch 
> block in flushCurrentBatch and by CoordinatorInternalEvent.complete.
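
The description above does not say how the fix was implemented. As a minimal sketch of one way such noise could be reduced, the code below picks a log level from the failure type, so that an expected leadership-change error is demoted while anything else stays at ERROR. Everything in it is hypothetical: the class WriteFailureLogLevel, its Level enum, and the local NotLeaderOrFollowerException stand-in (defined here only so the sketch compiles without Kafka on the classpath) are illustrations, not the coordinator's actual code.

```java
import java.util.concurrent.CompletionException;

// Hypothetical sketch: not taken from the Kafka code base.
class WriteFailureLogLevel {

    /** Local stand-in for Kafka's NotLeaderOrFollowerException (assumption). */
    static class NotLeaderOrFollowerException extends RuntimeException {
        NotLeaderOrFollowerException(String message) {
            super(message);
        }
    }

    /** Log levels used by this sketch only. */
    enum Level { DEBUG, ERROR }

    /** Peels off CompletionException wrappers to reach the underlying failure. */
    static Throwable unwrap(Throwable failure) {
        Throwable current = failure;
        while (current instanceof CompletionException && current.getCause() != null) {
            current = current.getCause();
        }
        return current;
    }

    /**
     * Returns DEBUG for the failure that is expected while leadership moves,
     * and ERROR for every other failure.
     */
    static Level levelFor(Throwable failure) {
        return unwrap(failure) instanceof NotLeaderOrFollowerException
            ? Level.DEBUG
            : Level.ERROR;
    }

    public static void main(String[] args) {
        Throwable expected = new CompletionException(
            new NotLeaderOrFollowerException("broker is not the current leader"));
        Throwable unexpected = new IllegalStateException("unexpected write failure");

        System.out.println(levelFor(expected));   // prints DEBUG
        System.out.println(levelFor(unexpected)); // prints ERROR
    }
}
```

A catch block like the one in flushCurrentBatch could consult such a helper before logging. Whether the demoted level should be DEBUG or INFO is a judgment call this sketch does not settle.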



--
This message was sent by Atlassian Jira
(v8.20.10#820010)