David Arthur created KAFKA-14178:
------------------------------------

             Summary: NoOpRecord incorrectly causes high controller queue time metric
                 Key: KAFKA-14178
                 URL: https://issues.apache.org/jira/browse/KAFKA-14178
             Project: Kafka
          Issue Type: Bug
          Components: controller, kraft, metrics
            Reporter: David Arthur
             Fix For: 3.3.0

When a deferred event is added to the queue in QuorumController, the total time it sits in the queue is counted toward the "EventQueueTimeMs" metric in QuorumControllerMetrics, including the interval for which it was deliberately deferred. With the introduction of NoOpRecords, the p99 value for this metric is equal to the interval at which we schedule the no-op records. E.g., if no-op records are scheduled every 5 seconds, we will see a p99 EventQueueTimeMs of 5 seconds. This makes it difficult, if not impossible, to see whether event processing on the controller is actually delayed.
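
The effect described above can be reproduced with a small standalone simulation. This is only a sketch in Python, not Kafka code: the `Event` fields, the two `*_queue_time` functions, the nearest-rank `percentile` helper, and the event counts (5 no-op events, 95 regular events, 2 ms of real processing lag) are all assumptions made for illustration. The second measurement, which counts only the time an event waits after it became eligible to run, is one possible way to separate real delay from the deferral interval; it is not presented as the fix adopted for this issue.

```python
from dataclasses import dataclass


@dataclass
class Event:
    enqueued_ms: int  # when the event was added to the queue
    deferred_ms: int  # requested deferral before it may run (0 = run immediately)
    started_ms: int   # when the event actually started running


def naive_queue_time(e: Event) -> int:
    # Counts the whole time in the queue, including the intentional deferral.
    return e.started_ms - e.enqueued_ms


def overdue_queue_time(e: Event) -> int:
    # Counts only the time spent waiting after the event became eligible.
    return e.started_ms - (e.enqueued_ms + e.deferred_ms)


def percentile(samples, p: int) -> int:
    # Nearest-rank percentile for an integer p in 1..100.
    ordered = sorted(samples)
    rank = max(1, (p * len(ordered) + 99) // 100)  # integer ceil(p * n / 100)
    return ordered[rank - 1]


def simulate(noop_interval_ms=5000, n_noop=5, n_regular=95, lag_ms=2):
    # Hypothetical workload: a mostly idle controller with periodic no-ops.
    events = []
    for i in range(n_noop):
        t = i * noop_interval_ms
        # A no-op event is deferred by the full scheduling interval.
        events.append(Event(t, noop_interval_ms, t + noop_interval_ms + lag_ms))
    for i in range(n_regular):
        t = i * 100
        # A regular event runs as soon as the queue gets to it.
        events.append(Event(t, 0, t + lag_ms))
    return events


if __name__ == "__main__":
    events = simulate()
    print("naive p99 (ms):  ", percentile([naive_queue_time(e) for e in events], 99))
    print("overdue p99 (ms):", percentile([overdue_queue_time(e) for e in events], 99))
```

With these assumed numbers the naive p99 is dominated by the 5-second deferral even though every event ran within 2 ms of becoming eligible, which is the masking effect this issue reports.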