Eliel Gomes created KAFKA-19888:
-----------------------------------
Summary: Coordinator histogram negative values causing persistent
write timeouts and consumer instability in Kafka 4.1.0
Key: KAFKA-19888
URL: https://issues.apache.org/jira/browse/KAFKA-19888
Project: Kafka
Issue Type: Bug
Components: group-coordinator
Affects Versions: 4.1.0
Environment: Kafka cluster (KRaft) with 9 brokers
RHEL-based OS (AlmaLinux 9.6)
Running on VMware vSphere
NTP service active on VMs
VMware Tools had "Synchronize time periodically" enabled
Reporter: Eliel Gomes
h2. Summary
In Kafka 4.1.0, a broker entered a corrupted state where the Coordinator began
producing {*}negative histogram values{*}, causing persistent write timeouts to
{{{}__consumer_offsets{}}}, continuous group coordinator instability, and
repeated consumer disconnections.
The issue did *not* self-heal after time synchronization stabilized and only
cleared after restarting the affected broker.
This suggests that the Coordinator histogram/timer subsystem becomes invalid
after certain time irregularities and does not recover automatically.
----
h2. *Description*
We observed repeated Coordinator failures on a single broker in a production
Kafka 4.1.0 cluster. The internal histogram used by the Coordinator runtime
began to record {*}negative values{*}, which should be impossible under correct
operation.
Once this occurs:
* Coordinator write events begin to time out repeatedly
* Consumers are disconnected and reconnect frequently
* Group coordination becomes unstable
* Offset commits to {{__consumer_offsets}} fail
* Consumer lag appears to grow ("fake" lag) because commits keep failing
* The broker remains in a broken state even after the system clock stabilizes
* Only restarting the broker clears the condition
* After the restart, consumer groups immediately select the same broker again
as coordinator, and the problem eventually reappears
This behavior suggests state corruption inside the Coordinator's
timer/histogram logic, where the {{GroupCoordinator}} component is susceptible
to a fatal {{ArrayIndexOutOfBoundsException}} when attempting to record a
negative value in an internal histogram.
This bug appears to be triggered by a non-monotonic clock adjustment (i.e., the
system clock moving backward).
When this exception occurs, the {{group-coordinator-event-processor}} thread
fails, specifically when processing internal events such as
{{HighWatermarkUpdate}} and {{FlushBatch}}.
These failures in turn cause offset *commit failures* for {*}consumers{*}.
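A minimal sketch of the suspected mechanism (illustrative only, *not* Kafka's actual code; the class and method names below are hypothetical): an event queue time measured with wall-clock timestamps can go negative when the system clock steps backward, while a delta taken from a monotonic source cannot.

```java
// Illustrative sketch, NOT Kafka's implementation. Shows how a queue time
// computed from wall-clock timestamps goes negative when the clock steps
// backward (e.g. a hypervisor time sync), which would trip a histogram that
// rejects negative values, as seen in the broker log above.
public class QueueTimeSketch {

    /** Queue time from wall-clock timestamps (vulnerable to clock steps). */
    static long wallClockQueueTimeMs(long enqueuedAtMs, long nowMs) {
        return nowMs - enqueuedAtMs; // negative if the clock moved backward
    }

    public static void main(String[] args) {
        long enqueuedAt = 1_700_000_000_000L;         // wall-clock enqueue time
        long nowAfterBackwardStep = enqueuedAt - 250; // clock stepped back 250 ms

        long delta = wallClockQueueTimeMs(enqueuedAt, nowAfterBackwardStep);
        System.out.println("wall-clock delta = " + delta + " ms"); // -250

        // A monotonic source such as System.nanoTime() never goes backward
        // between successive reads in the same JVM, so its deltas are >= 0:
        long t0 = System.nanoTime();
        long t1 = System.nanoTime();
        System.out.println("monotonic delta >= 0: " + (t1 - t0 >= 0));
    }
}
```

This is consistent with the stack trace below, where the failure originates in {{recordEventQueueTime}} handing a value to HdrHistogram, which throws on negative input.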
----
h2. *Actual Behavior*
* Coordinator internal histograms record *negative values*
* The Coordinator *fails to write to {{__consumer_offsets}}*
* Consumers experience recurring instability
* Kafka does not self-recover even when system time becomes stable
* Only a broker restart resolves the condition
----
h2. *Expected behavior*
Kafka should:
* Reject or guard against negative deltas in histogram/timer calculations
* Prevent a corrupted Coordinator state from persisting
* Avoid selecting a malfunctioning broker as group coordinator
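The first bullet could be implemented as a simple guard in front of the histogram. A hypothetical sketch (the class {{SafeHistogram}} and its fields are illustrative, not a proposed Kafka API): clamp negative deltas to zero and count them, instead of letting the recorder throw and kill the event-processor thread.

```java
// Hypothetical guard sketch, not Kafka code. Clamps negative samples so the
// histogram recorder never receives a value it would reject, while still
// counting the anomaly so operators can detect clock irregularities.
public class SafeHistogram {
    private long lastRecorded; // stand-in for the real histogram state
    private long dropped;      // number of negative samples clamped

    /** Record a value, clamping negatives to zero instead of throwing. */
    public void record(long valueMs) {
        if (valueMs < 0) {
            dropped++;        // surface the anomaly (real code might log a WARN)
            valueMs = 0;      // clamp so the underlying histogram stays valid
        }
        lastRecorded = valueMs; // real code: histogram.recordValue(valueMs)
    }

    public long lastRecorded() { return lastRecorded; }
    public long droppedCount() { return dropped; }
}
```

Whether clamping, dropping, or resetting is the right policy is for the maintainers to decide; the point is that a metrics-path anomaly should not take down the coordinator's event processing.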
----
h2. *Impact*
* Critical production consumer instability
* Continuous rebalances
* Offset commit delays
** "Fake Lags"
* Persistent coordinator malfunction
* Requires broker restarts to recover
* High operational risk
----
h2. *LOGS*
*Server-Side:* The {{GroupCoordinator}} broker log is flooded with the
following exception, indicating the event processor thread is failing:
*Kafka Log - 2025-11-12 at 14:56*
{code:java}
[2025-11-12 14:56:38,166] ERROR [group-coordinator-event-processor-0]: Failed
to run event InternalEvent(name=HighWatermarkUpdate) due to: Histogram recorded
value cannot be negative..
(org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread)
java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be
negative.
[2025-11-12 14:56:38,167] ERROR [GroupCoordinator id=5] Execution of
HighWatermarkUpdate failed due to Histogram recorded value cannot be negative..
(org.apache.kafka.coordinator.common.runtime.CoordinatorRuntime)
java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be
negative.
...
[2025-11-12 14:58:38,170] ERROR [group-coordinator-event-processor-2]: Failed
to run event InternalEvent(name=FlushBatch) due to: Histogram recorded value
cannot be negative..
(org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread)
java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be
negative.
[2025-11-12 14:58:38,170] ERROR [GroupCoordinator id=5] Execution of FlushBatch
failed due to Histogram recorded value cannot be negative..
(org.apache.kafka.coordinator.common.runtime.CoordinatorRuntime)
java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be
negative.{code}
*Kafka Log - 2025-11-13 09:15*
{code:java}
[2025-11-13 09:15:39,777] ERROR [group-coordinator-event-processor-0]: Failed
to run event InternalEvent(name=HighWatermarkUpdate) due to: Histogram recorded
value cannot be negative..
(org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread)
java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be
negative.
at
org.HdrHistogram.AbstractHistogram.countsArrayIndex(AbstractHistogram.java:2399)
~[HdrHistogram-2.2.2.jar:?]
at
org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:559)
~[HdrHistogram-2.2.2.jar:?]
at
org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:467)
~[HdrHistogram-2.2.2.jar:?]
at org.HdrHistogram.Recorder.recordValue(Recorder.java:136)
~[HdrHistogram-2.2.2.jar:?]
at
org.apache.kafka.coordinator.common.runtime.HdrHistogram.record(HdrHistogram.java:100)
~[kafka-coordinator-common-4.1.0.jar:?]
at
org.apache.kafka.coordinator.common.runtime.KafkaMetricHistogram.record(KafkaMetricHistogram.java:128)
~[kafka-coordinator-common-4.1.0.jar:?]
at
org.apache.kafka.common.metrics.Sensor.recordInternal(Sensor.java:236)
~[kafka-clients-4.1.0.jar:?]
at org.apache.kafka.common.metrics.Sensor.record(Sensor.java:197)
~[kafka-clients-4.1.0.jar:?]
at
org.apache.kafka.coordinator.common.runtime.CoordinatorRuntimeMetricsImpl.recordEventQueueTime(CoordinatorRuntimeMetricsImpl.java:284)
~[kafka-coordinator-common-4.1.0.jar:?]
at
org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread.handleEvents(MultiThreadedEventProcessor.java:146)
[kafka-coordinator-common-4.1.0.jar:?]
at
org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread.run(MultiThreadedEventProcessor.java:179)
[kafka-coordinator-common-4.1.0.jar:?]
[2025-11-13 09:15:39,777] ERROR [GroupCoordinator id=5] Execution of
HighWatermarkUpdate failed due to Histogram recorded value cannot be negative..
(org.apache.kafka.coordinator.common.runtime.CoordinatorRuntime)
java.lang.ArrayIndexOutOfBoundsException: Histogram recorded value cannot be
negative.
at
org.HdrHistogram.AbstractHistogram.countsArrayIndex(AbstractHistogram.java:2399)
~[HdrHistogram-2.2.2.jar:?]
at
org.HdrHistogram.AbstractHistogram.recordSingleValue(AbstractHistogram.java:559)
~[HdrHistogram-2.2.2.jar:?]
at
org.HdrHistogram.AbstractHistogram.recordValue(AbstractHistogram.java:467)
~[HdrHistogram-2.2.2.jar:?]
at org.HdrHistogram.Recorder.recordValue(Recorder.java:136)
~[HdrHistogram-2.2.2.jar:?]
at
org.apache.kafka.coordinator.common.runtime.HdrHistogram.record(HdrHistogram.java:100)
~[kafka-coordinator-common-4.1.0.jar:?]
at
org.apache.kafka.coordinator.common.runtime.KafkaMetricHistogram.record(KafkaMetricHistogram.java:128)
~[kafka-coordinator-common-4.1.0.jar:?]
at
org.apache.kafka.common.metrics.Sensor.recordInternal(Sensor.java:236)
~[kafka-clients-4.1.0.jar:?]
at org.apache.kafka.common.metrics.Sensor.record(Sensor.java:197)
~[kafka-clients-4.1.0.jar:?]
at
org.apache.kafka.coordinator.common.runtime.CoordinatorRuntimeMetricsImpl.recordEventQueueTime(CoordinatorRuntimeMetricsImpl.java:284)
~[kafka-coordinator-common-4.1.0.jar:?]
at
org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread.handleEvents(MultiThreadedEventProcessor.java:146)
[kafka-coordinator-common-4.1.0.jar:?]
at
org.apache.kafka.coordinator.common.runtime.MultiThreadedEventProcessor$EventProcessorThread.run(MultiThreadedEventProcessor.java:179)
[kafka-coordinator-common-4.1.0.jar:?] {code}
*Client-Side:* Consumers associated with this (failed) {{GroupCoordinator}}
begin to fail their offset commits. The client logs show a continuous loop
where the consumer:
* Tries to commit and fails with {{This is not the correct coordinator.}}
* Receives an error response {{{}NOT_COORDINATOR{}}}.
* Disconnects and attempts rediscovery.
* Discovers the _exact same_ (broken) broker as the coordinator and repeats
the cycle.
*Consumer Log - 2025-11-12 at 14:58*
{code:java}
2025-11-13 14:58:46.730 WARN [o.a.k.c.c.i.ConsumerCoordinator] [] - [Consumer
clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, groupId=CONSUMER_GROUP_LOW_PRIORITY]
Offset commit failed on partition SEND_SMS_LOW_PRIORITY-31 at offset 2291423:
The coordinator is not available.
2025-11-13 14:58:46.730 INFO [o.a.k.c.c.i.ConsumerCoordinator] [] - [Consumer
clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, groupId=CONSUMER_GROUP_LOW_PRIORITY]
Group coordinator X.X.X.X:9092 (id: 2147483642 rack: null) is unavailable or
invalid due to cause: error response COORDINATOR_NOT_AVAILABLE. isDisconnected:
false. Rediscovery will be attempted.
2025-11-12 14:58:46.388 INFO [o.a.k.c.c.i.ConsumerCoordinator] [] - [Consumer
clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, groupId=CONSUMER_GROUP_LOW_PRIORITY]
Requesting disconnect from last known coordinator 10.0.01:9092 (id: 2147483642
rack: null) 2025-11-12 14:58:46.395 INFO [o.apache.kafka.clients.NetworkClient]
[] - [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY,
groupId=CONSUMER_GROUP_LOW_PRIORITY] Client requested disconnect from node
2147483642 2025-11-12 14:58:46.396 INFO [o.apache.kafka.clients.NetworkClient]
[] - [Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY,
groupId=CONSUMER_GROUP_LOW_PRIORITY] Cancelled in-flight OFFSET_COMMIT request
with correlation id 3436277 due to node 2147483642 being disconnected (elapsed
time since creation: 12ms, elapsed time since send: 12ms, throttle time: 0ms,
request timeout: 30000ms) 2025-11-12 14:58:46.677 INFO
[o.a.k.c.c.i.ConsumerCoordinator] [] - [Consumer
clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, groupId=CONSUMER_GROUP_LOW_PRIORITY]
Discovered group coordinator X.X.X.X:9092 (id: 2147483642 rack: null) {code}
*Consumer Log - 2025-11-13 at 09:15*
{code:java}
2025-11-13 09:15:46.730 WARN [o.a.k.c.c.i.ConsumerCoordinator] [] - [Consumer
clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, groupId=CONSUMER_GROUP_LOW_PRIORITY]
Offset commit failed on partition SEND_SMS_LOW_PRIORITY-31 at offset 2291423:
The coordinator is not available.
2025-11-13 09:15:46.730 INFO [o.a.k.c.c.i.ConsumerCoordinator] [] - [Consumer
clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, groupId=CONSUMER_GROUP_LOW_PRIORITY]
Group coordinator X.X.X.X:9092 (id: 2147483642 rack: null) is unavailable or
invalid due to cause: error response COORDINATOR_NOT_AVAILABLE. isDisconnected:
false. Rediscovery will be attempted.
2025-11-13 09:15:46.730 INFO [o.a.k.c.c.i.ConsumerCoordinator] [] - [Consumer
clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, groupId=CONSUMER_GROUP_LOW_PRIORITY]
Requesting disconnect from last known coordinator X.X.X.X:9092 (id: 2147483642
rack: null)
2025-11-13 09:15:46.732 INFO [o.apache.kafka.clients.NetworkClient] [] -
[Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY,
groupId=CONSUMER_GROUP_LOW_PRIORITY] Client requested disconnect from node
2147483642
2025-11-13 09:15:46.949 INFO [o.a.k.c.c.i.ConsumerCoordinator] [] - [Consumer
clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, groupId=CONSUMER_GROUP_LOW_PRIORITY]
Discovered group coordinator X.X.X.X:9092 (id: 2147483642 rack: null)
2025-11-13 09:15:51.958 INFO [o.a.k.c.c.i.ConsumerCoordinator] [] - [Consumer
clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, groupId=CONSUMER_GROUP_LOW_PRIORITY]
Group coordinator X.X.X.X:9092 (id: 2147483642 rack: null) is unavailable or
invalid due to cause: error response COORDINATOR_NOT_AVAILABLE. isDisconnected:
false. Rediscovery will be attempted.
2025-11-13 09:15:51.958 INFO [o.a.k.c.c.i.ConsumerCoordinator] [] - [Consumer
clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, groupId=CONSUMER_GROUP_LOW_PRIORITY]
Requesting disconnect from last known coordinator X.X.X.X:9092 (id: 2147483642
rack: null)
2025-11-13 09:15:51.958 INFO [o.apache.kafka.clients.NetworkClient] [] -
[Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY,
groupId=CONSUMER_GROUP_LOW_PRIORITY] Client requested disconnect from node
2147483642
2025-11-13 09:15:51.958 INFO [o.apache.kafka.clients.NetworkClient] [] -
[Consumer clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY,
groupId=CONSUMER_GROUP_LOW_PRIORITY] Cancelled in-flight OFFSET_COMMIT request
with correlation id 3557487 due to node 2147483642 being disconnected (elapsed
time since creation: 4908ms, elapsed time since send: 4908ms, throttle time:
0ms, request timeout: 30000ms)
2025-11-13 09:15:52.086 INFO [SendAggregateTemplateRoute] [] - Grouped 2
messages.
2025-11-13 09:15:52.132 INFO [o.a.k.c.c.i.ConsumerCoordinator] [] - [Consumer
clientId=CLIENT_ID_CONSUMER_LOW_PRIORITY, groupId=CONSUMER_GROUP_LOW_PRIORITY]
Discovered group coordinator X.X.X.X:9092 (id: 2147483642 rack: null) {code}
----
*Workaround (Temporary Fix):*
# Restart the affected broker to force a new {{GroupCoordinator}} election for
the affected consumer groups.
## Note that after the broker restarts, consumers will once again discover it
as group coordinator, so the problem can eventually reappear.
----
h2. *NTP*
All Kafka brokers and consumer clients run with the NTP service active:
{code:java}
timedatectl
Local time: Fri 2025-11-14 11:07:49 -03
Universal time: Fri 2025-11-14 14:07:49 UTC
RTC time: Fri 2025-11-14 14:07:49
Time zone: America/Sao_Paulo (-03, -0300)
System clock synchronized: yes
NTP service: active
RTC in local TZ: no {code}
----
h2. *Request for Analysis / Questions*
# Does Kafka 4.1.0 require strictly monotonic time for Coordinator
histogram/timer logic?
# Is histogram corruption (negative values) a known issue in 4.x?
# Should Kafka validate histogram deltas to prevent negative values?
# Is this fixed in newer Kafka releases?
# Should {{CoordinatorRuntime}} reset its internal state when encountering time
irregularities?
# Is there a recommended workaround to avoid Coordinator entering this
permanent error state?
--
This message was sent by Atlassian Jira
(v8.20.10#820010)