Hello Kafka Community,
I would like to start a discussion on KIP-1350, which proposes exposing Streams 
group heartbeat statuses as Kafka Streams metrics.

- KIP: 
https://cwiki.apache.org/confluence/display/KAFKA/KIP-1350%3A+Expose+Streams+Group+Heartbeat+Statuses+as+Kafka+Streams+Metrics

When Kafka Streams uses the Streams rebalance protocol, each stream thread 
receives StreamsGroupHeartbeatResponse messages from the group coordinator.
These responses may include statuses such as MISSING_SOURCE_TOPICS, 
ASSIGNMENT_DELAYED, or INCORRECTLY_PARTITIONED_TOPICS.

Today, these statuses are used internally or logged, but they are not exposed 
as metrics. 
In large deployments, operators therefore have to inspect client logs to find 
out why a specific stream thread is not processing or why its assignment is 
delayed.

This KIP proposes adding a thread-level metric, streams-group-status, tagged by 
thread-id and status, to report whether the latest heartbeat response for a 
stream thread contains a given status.
The goal is to make Streams rebalance protocol issues easier to diagnose from 
metrics.
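
To illustrate the intended semantics, here is a minimal, self-contained sketch 
(plain Java, no Kafka dependencies). It is not the proposed implementation: the 
class and method names are hypothetical, and the 1/0 gauge encoding is my 
assumption for this example; the KIP page is authoritative on the actual value 
type. Only the metric name, the two tags, and the status names come from the 
text above.

```java
import java.util.EnumSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for one stream thread's status gauges; not Kafka Streams code.
class StreamsGroupStatusSketch {

    // Status names taken from the examples mentioned in this email.
    enum Status { MISSING_SOURCE_TOPICS, ASSIGNMENT_DELAYED, INCORRECTLY_PARTITIONED_TOPICS }

    private final String threadId;
    // Statuses carried by the most recent heartbeat response only.
    private volatile Set<Status> latest = EnumSet.noneOf(Status.class);

    StreamsGroupStatusSketch(String threadId) {
        this.threadId = threadId;
    }

    // Called once per heartbeat response; replaces the previous snapshot.
    void onHeartbeatResponse(Set<Status> statuses) {
        EnumSet<Status> copy = EnumSet.noneOf(Status.class);
        copy.addAll(statuses);
        latest = copy;
    }

    // Assumed encoding: 1 if the latest response contained the status, else 0.
    int gaugeValue(Status status) {
        return latest.contains(status) ? 1 : 0;
    }

    // One entry per (thread-id, status) tag combination.
    Map<String, Integer> snapshot() {
        Map<String, Integer> out = new LinkedHashMap<>();
        for (Status s : Status.values()) {
            String key = "streams-group-status{thread-id=" + threadId + ",status=" + s + "}";
            out.put(key, gaugeValue(s));
        }
        return out;
    }

    public static void main(String[] args) {
        StreamsGroupStatusSketch thread = new StreamsGroupStatusSketch("app-StreamThread-1");
        thread.onHeartbeatResponse(EnumSet.of(Status.ASSIGNMENT_DELAYED));
        thread.snapshot().forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Under this assumed encoding, an alert on a non-zero value for a given status 
tag would flag the affected thread without anyone reading its logs, and the 
value returns to 0 as soon as a later heartbeat response no longer carries 
that status.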

Looking forward to the community's feedback.

Best regards,
Sanghyeok An.
