[ https://issues.apache.org/jira/browse/KAFKA-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Marcin Kuthan updated KAFKA-6925:
---------------------------------
    Description:
The retained heap of org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl is surprisingly high for a long-running job. It reaches over 100MB of heap for every stream after a week of uptime, whereas for the same application a few hours after start the heap takes 2MB.

For the problematic instance, the majority of the memory retained by StreamsMetricsThreadImpl is occupied by hash map entries in parentSensors: over 8000 elements of 100+kB each. For a fresh instance there are fewer than 200 elements.

Below is the retained set report generated by Eclipse MAT, but I'm not fully sure about its correctness due to the complex object graph in the metrics-related code.

{code:java}
Class Name                                                                         | Objects | Shallow Heap
-----------------------------------------------------------------------------------------------------------
org.apache.kafka.common.metrics.KafkaMetric                                        | 140,476 |    4,495,232
org.apache.kafka.common.MetricName                                                 | 140,476 |    4,495,232
org.apache.kafka.common.metrics.stats.SampledStat$Sample                           |  73,599 |    3,532,752
org.apache.kafka.common.metrics.stats.Meter                                        |  42,104 |    1,347,328
org.apache.kafka.common.metrics.stats.Count                                        |  42,104 |    1,347,328
org.apache.kafka.common.metrics.stats.Rate                                         |  42,104 |    1,010,496
org.apache.kafka.common.metrics.stats.Total                                        |  42,104 |    1,010,496
org.apache.kafka.common.metrics.stats.Max                                          |  28,134 |      900,288
org.apache.kafka.common.metrics.stats.Avg                                          |  28,134 |      900,288
org.apache.kafka.common.metrics.Sensor                                             |   3,164 |      202,496
org.apache.kafka.common.metrics.Sensor[]                                           |   3,164 |       71,088
org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl |       1 |           56
-----------------------------------------------------------------------------------------------------------
{code}

  was:
The retained heap of org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl is surprisingly high for a long-running job.
It reaches over 100MB of heap for every stream after a week of uptime, whereas for the same application a few hours after start it takes 2MB.

For the problematic instance, the majority of the memory retained by StreamsMetricsThreadImpl is occupied by hash map entries in parentSensors: over 8000 elements of 100+kB each. For a fresh instance there are fewer than 200 elements.

Below is the retained set report generated by Eclipse MAT, but I'm not fully sure about its correctness due to the complex object graph in the metrics-related code.

{code:java}
Class Name                                                                         | Objects | Shallow Heap
-----------------------------------------------------------------------------------------------------------
org.apache.kafka.common.metrics.KafkaMetric                                        | 140,476 |    4,495,232
org.apache.kafka.common.MetricName                                                 | 140,476 |    4,495,232
org.apache.kafka.common.metrics.stats.SampledStat$Sample                           |  73,599 |    3,532,752
org.apache.kafka.common.metrics.stats.Meter                                        |  42,104 |    1,347,328
org.apache.kafka.common.metrics.stats.Count                                        |  42,104 |    1,347,328
org.apache.kafka.common.metrics.stats.Rate                                         |  42,104 |    1,010,496
org.apache.kafka.common.metrics.stats.Total                                        |  42,104 |    1,010,496
org.apache.kafka.common.metrics.stats.Max                                          |  28,134 |      900,288
org.apache.kafka.common.metrics.stats.Avg                                          |  28,134 |      900,288
org.apache.kafka.common.metrics.Sensor                                             |   3,164 |      202,496
org.apache.kafka.common.metrics.Sensor[]                                           |   3,164 |       71,088
org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl |       1 |           56
-----------------------------------------------------------------------------------------------------------
{code}


> Memory leak in org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl
> -------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6925
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6925
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 1.0.1
>            Reporter: Marcin Kuthan
>            Priority: Major
>
> The retained heap of org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl
> is surprisingly high for a long-running job. It reaches over 100MB of heap for every
> stream after a week of uptime, whereas for the same application a few hours
> after start the heap takes 2MB.
>
> For the problematic instance, the majority of the memory retained by
> StreamsMetricsThreadImpl is occupied by hash map entries in parentSensors:
> over 8000 elements of 100+kB each. For a fresh instance there are fewer than
> 200 elements.
>
> Below is the retained set report generated by Eclipse MAT, but I'm not fully
> sure about its correctness due to the complex object graph in the
> metrics-related code.
>
> {code:java}
> Class Name                                                                         | Objects | Shallow Heap
> -----------------------------------------------------------------------------------------------------------
> org.apache.kafka.common.metrics.KafkaMetric                                        | 140,476 |    4,495,232
> org.apache.kafka.common.MetricName                                                 | 140,476 |    4,495,232
> org.apache.kafka.common.metrics.stats.SampledStat$Sample                           |  73,599 |    3,532,752
> org.apache.kafka.common.metrics.stats.Meter                                        |  42,104 |    1,347,328
> org.apache.kafka.common.metrics.stats.Count                                        |  42,104 |    1,347,328
> org.apache.kafka.common.metrics.stats.Rate                                         |  42,104 |    1,010,496
> org.apache.kafka.common.metrics.stats.Total                                        |  42,104 |    1,010,496
> org.apache.kafka.common.metrics.stats.Max                                          |  28,134 |      900,288
> org.apache.kafka.common.metrics.stats.Avg                                          |  28,134 |      900,288
> org.apache.kafka.common.metrics.Sensor                                             |   3,164 |      202,496
> org.apache.kafka.common.metrics.Sensor[]                                           |   3,164 |       71,088
> org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl |       1 |           56
> -----------------------------------------------------------------------------------------------------------
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
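As a rough cross-check of the figures in the report, the Shallow Heap column can be summed. The sketch below hard-codes the twelve rows from the table; the class name ShallowHeapTotals and its method names are hypothetical and are not part of Kafka or Eclipse MAT.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: sums the "Shallow Heap" column of the retained set report.
public class ShallowHeapTotals {

    // Class name -> shallow heap in bytes, copied row by row from the report.
    static Map<String, Long> shallowHeapBytes() {
        Map<String, Long> rows = new LinkedHashMap<>();
        rows.put("org.apache.kafka.common.metrics.KafkaMetric", 4_495_232L);
        rows.put("org.apache.kafka.common.MetricName", 4_495_232L);
        rows.put("org.apache.kafka.common.metrics.stats.SampledStat$Sample", 3_532_752L);
        rows.put("org.apache.kafka.common.metrics.stats.Meter", 1_347_328L);
        rows.put("org.apache.kafka.common.metrics.stats.Count", 1_347_328L);
        rows.put("org.apache.kafka.common.metrics.stats.Rate", 1_010_496L);
        rows.put("org.apache.kafka.common.metrics.stats.Total", 1_010_496L);
        rows.put("org.apache.kafka.common.metrics.stats.Max", 900_288L);
        rows.put("org.apache.kafka.common.metrics.stats.Avg", 900_288L);
        rows.put("org.apache.kafka.common.metrics.Sensor", 202_496L);
        rows.put("org.apache.kafka.common.metrics.Sensor[]", 71_088L);
        rows.put("org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl", 56L);
        return rows;
    }

    // Total shallow heap over all listed classes, in bytes.
    static long totalBytes() {
        long total = 0;
        for (long bytes : shallowHeapBytes().values()) {
            total += bytes;
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("total shallow heap bytes: " + totalBytes());
    }
}
```

The listed rows add up to 19,313,080 bytes. Shallow heap counts only each object's own fields, so this sum is not expected to match the retained heap of over 100MB quoted in the description.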