Patrick Pang created KAFKA-15792:
------------------------------------
Summary: Kafka Streams stuck partition fixed after restarting the
process
Key: KAFKA-15792
URL: https://issues.apache.org/jira/browse/KAFKA-15792
Project: Kafka
Issue Type: New Feature
Components: streams
Affects Versions: 3.1.2
Reporter: Patrick Pang
Our Kafka Streams process is often slow to process one particular partition
on a specific instance. No data skew is detected, i.e. data is mostly
uniformly distributed across partitions. The symptom is a huge lag on that
partition.
After restarting the process, the lag drains within 5 minutes of startup.
This points to an internal processing issue in our streams application rather
than a cluster problem or a poison message.
Are there any metrics you suggest we look at, or is this a known issue?
Regularly bouncing the application doesn't look like a proper fix for
production systems.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)