[ https://issues.apache.org/jira/browse/KAFKA-6925?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
John Roesler updated KAFKA-6925:
--------------------------------
    Description:
*Note: this issue was fixed incidentally in 2.0, so it is only present in versions 0.x and 1.x.*

The retained heap of org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl is surprisingly high for a long-running job. It exceeds 100 MB of heap per stream after a week of uptime, whereas the same application uses only 2 MB a few hours after start.

For the problematic instance, the majority of the StreamsMetricsThreadImpl memory is occupied by hash map entries in parentSensors: over 8000 elements of 100+ kB each. A fresh instance has fewer than 200 elements.

Below is the retained set report generated with Eclipse MAT, although I'm not fully sure it is correct because of the complex object graph in the metrics-related code. Number of objects in a single StreamThread$StreamsMetricsThreadImpl instance:

{code:java}
Class Name                                                                         | Objects | Shallow Heap
-----------------------------------------------------------------------------------------------------------
org.apache.kafka.common.metrics.KafkaMetric                                        | 140,476 |    4,495,232
org.apache.kafka.common.MetricName                                                 | 140,476 |    4,495,232
org.apache.kafka.common.metrics.stats.SampledStat$Sample                           |  73,599 |    3,532,752
org.apache.kafka.common.metrics.stats.Meter                                        |  42,104 |    1,347,328
org.apache.kafka.common.metrics.stats.Count                                        |  42,104 |    1,347,328
org.apache.kafka.common.metrics.stats.Rate                                         |  42,104 |    1,010,496
org.apache.kafka.common.metrics.stats.Total                                        |  42,104 |    1,010,496
org.apache.kafka.common.metrics.stats.Max                                          |  28,134 |      900,288
org.apache.kafka.common.metrics.stats.Avg                                          |  28,134 |      900,288
org.apache.kafka.common.metrics.Sensor                                             |   3,164 |      202,496
org.apache.kafka.common.metrics.Sensor[]                                           |   3,164 |       71,088
org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl |       1 |           56
-----------------------------------------------------------------------------------------------------------
{code}

  was:
The retained heap of
org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl is surprisingly high for a long-running job. It exceeds 100 MB of heap per stream after a week of uptime, whereas the same application uses only 2 MB a few hours after start.

For the problematic instance, the majority of the StreamsMetricsThreadImpl memory is occupied by hash map entries in parentSensors: over 8000 elements of 100+ kB each. A fresh instance has fewer than 200 elements.

Below is the retained set report generated with Eclipse MAT, although I'm not fully sure it is correct because of the complex object graph in the metrics-related code. Number of objects in a single StreamThread$StreamsMetricsThreadImpl instance:

{code:java}
Class Name                                                                         | Objects | Shallow Heap
-----------------------------------------------------------------------------------------------------------
org.apache.kafka.common.metrics.KafkaMetric                                        | 140,476 |    4,495,232
org.apache.kafka.common.MetricName                                                 | 140,476 |    4,495,232
org.apache.kafka.common.metrics.stats.SampledStat$Sample                           |  73,599 |    3,532,752
org.apache.kafka.common.metrics.stats.Meter                                        |  42,104 |    1,347,328
org.apache.kafka.common.metrics.stats.Count                                        |  42,104 |    1,347,328
org.apache.kafka.common.metrics.stats.Rate                                         |  42,104 |    1,010,496
org.apache.kafka.common.metrics.stats.Total                                        |  42,104 |    1,010,496
org.apache.kafka.common.metrics.stats.Max                                          |  28,134 |      900,288
org.apache.kafka.common.metrics.stats.Avg                                          |  28,134 |      900,288
org.apache.kafka.common.metrics.Sensor                                             |   3,164 |      202,496
org.apache.kafka.common.metrics.Sensor[]                                           |   3,164 |       71,088
org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl |       1 |           56
-----------------------------------------------------------------------------------------------------------
{code}


> Memory leak in org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl
> -------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-6925
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6925
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.11.0.2, 1.1.0, 1.0.1
>            Reporter: Marcin Kuthan
>            Assignee: John Roesler
>            Priority: Major
>
> *Note: this issue was fixed incidentally in 2.0, so it is only present in
> versions 0.x and 1.x.*
>
> The retained heap of
> org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl
> is surprisingly high for a long-running job. It exceeds 100 MB of heap per
> stream after a week of uptime, whereas the same application uses only 2 MB a
> few hours after start.
> For the problematic instance, the majority of the StreamsMetricsThreadImpl
> memory is occupied by hash map entries in parentSensors: over 8000 elements
> of 100+ kB each. A fresh instance has fewer than 200 elements.
> Below is the retained set report generated with Eclipse MAT, although I'm
> not fully sure it is correct because of the complex object graph in the
> metrics-related code. Number of objects in a single
> StreamThread$StreamsMetricsThreadImpl instance:
>
> {code:java}
> Class Name                                                                         | Objects | Shallow Heap
> -----------------------------------------------------------------------------------------------------------
> org.apache.kafka.common.metrics.KafkaMetric                                        | 140,476 |    4,495,232
> org.apache.kafka.common.MetricName                                                 | 140,476 |    4,495,232
> org.apache.kafka.common.metrics.stats.SampledStat$Sample                           |  73,599 |    3,532,752
> org.apache.kafka.common.metrics.stats.Meter                                        |  42,104 |    1,347,328
> org.apache.kafka.common.metrics.stats.Count                                        |  42,104 |    1,347,328
> org.apache.kafka.common.metrics.stats.Rate                                         |  42,104 |    1,010,496
> org.apache.kafka.common.metrics.stats.Total                                        |  42,104 |    1,010,496
> org.apache.kafka.common.metrics.stats.Max                                          |  28,134 |      900,288
> org.apache.kafka.common.metrics.stats.Avg                                          |  28,134 |      900,288
> org.apache.kafka.common.metrics.Sensor                                             |   3,164 |      202,496
> org.apache.kafka.common.metrics.Sensor[]                                           |   3,164 |       71,088
> org.apache.kafka.streams.processor.internals.StreamThread$StreamsMetricsThreadImpl |       1 |           56
> -----------------------------------------------------------------------------------------------------------
> {code}
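
To make the numbers in the report easier to check, here is a minimal sketch that sums the Shallow Heap column and derives the average shallow size per object for each row. It uses only the figures quoted in the report; the class name ShallowHeapSummary and its methods are hypothetical helpers, not part of Kafka or Eclipse MAT. Note that it adds up shallow sizes of the listed rows only, so the result is not the retained heap figure mentioned in the description.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper: summarizes the shallow-heap figures from the MAT report. */
final class ShallowHeapSummary {

    // Rows copied from the report: simple class name -> {object count, shallow heap in bytes}.
    static final Map<String, long[]> ROWS = new LinkedHashMap<>();
    static {
        ROWS.put("KafkaMetric", new long[] {140_476, 4_495_232});
        ROWS.put("MetricName", new long[] {140_476, 4_495_232});
        ROWS.put("SampledStat$Sample", new long[] {73_599, 3_532_752});
        ROWS.put("Meter", new long[] {42_104, 1_347_328});
        ROWS.put("Count", new long[] {42_104, 1_347_328});
        ROWS.put("Rate", new long[] {42_104, 1_010_496});
        ROWS.put("Total", new long[] {42_104, 1_010_496});
        ROWS.put("Max", new long[] {28_134, 900_288});
        ROWS.put("Avg", new long[] {28_134, 900_288});
        ROWS.put("Sensor", new long[] {3_164, 202_496});
        ROWS.put("Sensor[]", new long[] {3_164, 71_088});
        ROWS.put("StreamsMetricsThreadImpl", new long[] {1, 56});
    }

    /** Sum of the Shallow Heap column, in bytes. */
    static long totalShallowBytes() {
        long total = 0;
        for (long[] row : ROWS.values()) {
            total += row[1];
        }
        return total;
    }

    /** Average shallow size of one object of the given row, in bytes. */
    static double bytesPerObject(String simpleName) {
        long[] row = ROWS.get(simpleName);
        if (row == null) {
            throw new IllegalArgumentException("Unknown row: " + simpleName);
        }
        return (double) row[1] / row[0];
    }

    public static void main(String[] args) {
        for (String name : ROWS.keySet()) {
            System.out.printf("%-26s %8.1f bytes/object%n", name, bytesPerObject(name));
        }
        long total = totalShallowBytes();
        System.out.printf("Total shallow heap: %d bytes (%.1f MB)%n", total, total / 1e6);
    }
}
```

The listed rows add up to 19,313,080 bytes of shallow heap, roughly 19.3 MB. KafkaMetric and MetricName together account for about 9 MB of that, at 32 bytes per object each.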