Tamas Kornai created KAFKA-20449:
------------------------------------
Summary: OffsetFetcherUtils.updateSubscriptionState logs at WARN
for benign race condition during rebalance
Key: KAFKA-20449
URL: https://issues.apache.org/jira/browse/KAFKA-20449
Project: Kafka
Issue Type: Bug
Components: clients, consumer
Affects Versions: 4.1.2, 4.3.0, 4.2.1
Reporter: Tamas Kornai
KAFKA-20131 introduced a log.warn() in
OffsetFetcherUtils.updateSubscriptionState() for the case where a LIST_OFFSETS
response arrives after the partition has already been revoked:
{code:java}
log.warn("Not updating high watermark for partition {} as it is no longer assigned", partition);
{code}
This is a benign race condition that occurs naturally during consumer group
rebalances: a LIST_OFFSETS request is issued, the partition is revoked
before the response returns, and the response is simply
discarded. No data loss or correctness issue arises.
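
The sequence of events can be illustrated with a minimal, self-contained sketch. It is not Kafka's actual code: the class StaleResponseSketch, its methods, and the partition name "orders-0" are hypothetical stand-ins for the consumer's subscription state, assumed here only to show why discarding the late response is harmless.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the consumer's subscription state; not Kafka's real class.
public class StaleResponseSketch {
    private final Set<String> assigned = new HashSet<>();
    private final Map<String, Long> highWatermarks = new HashMap<>();

    public void assign(String partition) {
        assigned.add(partition);
    }

    // A rebalance revokes the partition and drops any state kept for it.
    public void revoke(String partition) {
        assigned.remove(partition);
        highWatermarks.remove(partition);
    }

    // Returns true if the offset was applied, false if the response was discarded
    // because the partition is no longer assigned.
    public boolean onListOffsetsResponse(String partition, long offset) {
        if (!assigned.contains(partition)) {
            return false; // stale response: nothing to update, nothing lost
        }
        highWatermarks.put(partition, offset);
        return true;
    }

    public Long highWatermark(String partition) {
        return highWatermarks.get(partition);
    }

    public static void main(String[] args) {
        StaleResponseSketch state = new StaleResponseSketch();
        state.assign("orders-0");
        // A LIST_OFFSETS request is in flight here; the rebalance revokes the partition first.
        state.revoke("orders-0");
        boolean applied = state.onListOffsetsResponse("orders-0", 42L);
        System.out.println("applied=" + applied);
    }
}
```

In this sketch the late response changes no state, which matches the behavior described above: the only observable effect is the log line.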
However, the WARN level is problematic in practice. Many organizations use
warning log rates as a signal for canary release health. Since rebalances are
frequent during rolling deployments, canary instances
generate a high volume of these warnings, which triggers automated rollback of
otherwise healthy releases.
The log should be downgraded to DEBUG (or at most INFO), since it describes
expected, harmless behavior that requires no operator action.
Affected code:
clients/src/main/java/org/apache/kafka/clients/consumer/internals/OffsetFetcherUtils.java,
lines 281-285 (introduced in commit abcbef6a4c, PR #21457).
Both log statements are affected:
- "Not updating high watermark for partition {} as it is no longer assigned"
(READ_UNCOMMITTED)
- "Not updating last stable offset for partition {} as it is no longer
assigned" (READ_COMMITTED)
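
A rough sketch of the requested change, under stated assumptions: the class StaleOffsetLogSketch and its methods are hypothetical, and java.lang.System.Logger is used only to keep the example self-contained (the snippet quoted above calls log.warn on Kafka's own logger). It shows both messages being emitted at DEBUG rather than WARN.

```java
import java.lang.System.Logger;
import java.lang.System.Logger.Level;

// Hypothetical illustration of the proposed level change; not the actual Kafka patch.
public class StaleOffsetLogSketch {
    private static final Logger LOG = System.getLogger(StaleOffsetLogSketch.class.getName());

    // Assumption: DEBUG is chosen; the issue also allows INFO as an upper bound.
    public static final Level STALE_RESPONSE_LEVEL = Level.DEBUG;

    // Builds the message for either isolation level, using the wording quoted in the issue.
    public static String staleMessage(boolean readCommitted, String partition) {
        String what = readCommitted ? "last stable offset" : "high watermark";
        return "Not updating " + what + " for partition " + partition
                + " as it is no longer assigned";
    }

    // Logs the stale-response message below WARN so it does not count toward warning rates.
    public static void logStaleResponse(boolean readCommitted, String partition) {
        LOG.log(STALE_RESPONSE_LEVEL, staleMessage(readCommitted, partition));
    }

    public static void main(String[] args) {
        logStaleResponse(false, "orders-0"); // READ_UNCOMMITTED case
        logStaleResponse(true, "orders-0");  // READ_COMMITTED case
    }
}
```

With a default logging configuration, DEBUG messages are typically not emitted at all, so a warning-rate signal would no longer see these lines.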