[
https://issues.apache.org/jira/browse/KAFKA-20635?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
David Jacot resolved KAFKA-20635.
---------------------------------
Fix Version/s: 4.3.1, 4.4.0
Resolution: Fixed
> Spurious "Writing records..." failed errors in the group coordinator after
> partition leadership change
> ------------------------------------------------------------------------------------------------------
>
> Key: KAFKA-20635
> URL: https://issues.apache.org/jira/browse/KAFKA-20635
> Project: Kafka
> Issue Type: Bug
> Affects Versions: 4.0.0, 4.1.0, 4.2.0, 4.3.0
> Reporter: David Jacot
> Assignee: David Jacot
> Priority: Minor
> Fix For: 4.3.1, 4.4.0
>
>
> During routine __consumer_offsets partition leadership changes, the group
> coordinator logs an ERROR for every write that is in flight at the moment of
> the transition:
> {noformat}
> [GroupCoordinator id=N] Writing records to __consumer_offsets-N failed due
> to: For requests intended only for the leader, this error indicates that the
> broker is not the current leader ...
> [GroupCoordinator id=N] Execution of FlushBatch failed due to For requests
> intended only for the leader, this error indicates that the broker is not the
> current leader ...
> {noformat}
> These appear on the group coordinator that lost leadership and persist until
> the queue of in-flight batches has drained. The behavior itself is correct:
> NotLeaderOrFollowerException propagates through failCurrentBatch to the
> deferred events and is mapped to NOT_COORDINATOR for clients via
> CoordinatorOperationExceptionHelper, so clients retry against the new
> coordinator. This is purely a logging-noise issue.
> Same root cause as KAFKA-20634: the partition transitions to follower
> synchronously, while the coordinator unload is asynchronous. In that window,
> partitionWriter.append calls replicaManager.appendRecordsToLeader, which
> legitimately rejects writes for a partition no longer led by this broker. The
> exception is expected, but it is logged at ERROR both by the catch block in
> flushCurrentBatch and by CoordinatorInternalEvent.complete.
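
One way to quiet this kind of noise is to choose the log level from the exception type, so that an expected leadership-change error is logged below ERROR while anything else is still logged at ERROR. The sketch below only illustrates that idea and is not necessarily the change merged for this issue. The class WriteFailureLogLevel, the Level enum, and the levelFor helper are hypothetical names, and the nested NotLeaderOrFollowerException is a stand-in for the Kafka class of the same name so that the example compiles without Kafka on the classpath.

```java
import java.util.concurrent.CompletionException;

public class WriteFailureLogLevel {
    // Stand-in for org.apache.kafka.common.errors.NotLeaderOrFollowerException,
    // declared here only so the sketch compiles without Kafka on the classpath.
    static class NotLeaderOrFollowerException extends RuntimeException {
        NotLeaderOrFollowerException(String message) {
            super(message);
        }
    }

    // Hypothetical log levels; a real implementation would use its logging framework.
    enum Level { DEBUG, ERROR }

    // Hypothetical helper: pick the level at which a failed write should be logged.
    static Level levelFor(Throwable error) {
        Throwable cause = error;
        // Unwrap the wrapper that CompletableFuture adds around failures.
        while (cause instanceof CompletionException && cause.getCause() != null) {
            cause = cause.getCause();
        }
        // Losing leadership is expected during a transition, so it is not an ERROR.
        return cause instanceof NotLeaderOrFollowerException ? Level.DEBUG : Level.ERROR;
    }

    public static void main(String[] args) {
        System.out.println(levelFor(new NotLeaderOrFollowerException("not leader")));
        System.out.println(levelFor(new IllegalStateException("unexpected failure")));
    }
}
```

Keeping every other exception at ERROR means genuine write failures stay visible; only the error that is known to be expected during the leadership transition is demoted.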