[
https://issues.apache.org/jira/browse/KAFKA-15792?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Matthias J. Sax updated KAFKA-15792:
------------------------------------
Issue Type: Bug (was: New Feature)
> Kafka Streams stuck partition fixed after restarting the process
> ----------------------------------------------------------------
>
> Key: KAFKA-15792
> URL: https://issues.apache.org/jira/browse/KAFKA-15792
> Project: Kafka
> Issue Type: Bug
> Components: streams
> Affects Versions: 3.1.2
> Reporter: Patrick Pang
> Priority: Major
>
> Our Kafka Streams process often becomes slow at processing one particular
> partition on a specific instance. No data skew is detected, i.e. data is
> spread mostly uniformly across partitions. The symptom is a huge lag on a
> specific partition. We also notice that outbound network traffic is higher
> on the problematic process than on a normal process, often about 3x.
> After restarting the process, the lag drains within 5 minutes of startup.
> This hints at an internal processing issue in our Streams application rather
> than a cluster problem or a poison message.
> Are there any metrics you suggest we look at, or is this a known issue?
> Regularly bouncing the application doesn't look like a proper fix for
> production systems.
--
This message was sent by Atlassian Jira
(v8.20.10#820010)