[ 
https://issues.apache.org/jira/browse/KAFKA-5122?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Guozhang Wang resolved KAFKA-5122.
----------------------------------
    Resolution: Not A Problem

> Kafka Streams unexpected off-heap memory growth
> -----------------------------------------------
>
>                 Key: KAFKA-5122
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5122
>             Project: Kafka
>          Issue Type: Bug
>          Components: streams
>    Affects Versions: 0.10.2.0
>         Environment: Linux 64-bit
> Oracle JVM version "1.8.0_121"
>            Reporter: Jon Buffington
>            Assignee: Guozhang Wang
>            Priority: Minor
>
> I have a Kafka Streams application that leaks off-heap memory at a rate of 
> 20MB per commit interval. The application is configured with a 1G heap; the 
> heap memory does not show signs of leaking. The application reaches 16g of 
> system memory usage before terminating and restarting.
> Application facts:
> * The data pipeline is source -> map -> groupByKey -> reduce -> to (a minimal 
> sketch of this topology follows at the end of this description).
> * The reduce operation uses a tumbling time window 
> TimeWindows.of(TimeUnit.HOURS.toMillis(1)).until(TimeUnit.HOURS.toMillis(168)).
> * The commit interval is five minutes (300000ms).
> * The application links to v0.10.2.0-cp1 of the Kafka libraries. When I link 
> to the current 0.10.2.1 RC3, the leak rate changes to ~10MB per commit 
> interval.
> * The application uses the schema registry for two pairs of serdes. One serde 
> pair is used to read from a source topic that has 40 partitions. The other 
> serde pair is used by the internal changelog and repartition topics created 
> by the groupByKey/reduce operations.
> * The source input rate varies between 500 and 1500 records/sec. The source 
> rate variation does not change the size or frequency of the leak.
> * The application heap has been configured using both 1024m and 2048m. The 
> only observed difference between the two JVM heap sizes is more old-gen 
> collections at 1024m, although there is little difference in throughput. JVM 
> settings are {-server -Djava.awt.headless=true -Xss256k 
> -XX:MaxMetaspaceSize=128m -XX:ReservedCodeCacheSize=64m 
> -XX:CompressedClassSpaceSize=32m -XX:MaxDirectMemorySize=128m 
> -XX:+AlwaysPreTouch -XX:+UseG1GC -XX:MaxGCPauseMillis=50 
> -XX:InitiatingHeapOccupancyPercent=35 -XX:+PerfDisableSharedMem 
> -XX:+UseStringDeduplication -XX:MinMetaspaceFreeRatio=50 
> -XX:MaxMetaspaceFreeRatio=80}
> * We configure a custom RocksDBConfigSetter to set 
> options.setMaxBackgroundCompactions(Runtime.getRuntime.availableProcessors) 
> (a sketch of this setter follows at the end of this description).
> * Per 
> <http://mail-archives.apache.org/mod_mbox/kafka-users/201702.mbox/%3ccahwhrrxxpwgyvr1ctwgoudkr7cqkaq+52phfpuzs4j-wv7k...@mail.gmail.com%3e>,
>  the SSTables are being compacted. Total disk usage for the state files 
> (RocksDB) is ~2.5g. Per partition and window, there are 3-4 SSTables.
> * The application is written in Scala and compiled using version 2.12.1.
> Various experiments that had no effect on the leak rate:
> * Tried different RocksDB block sizes (4k, 16k, and 32k).
> * Different numbers of instances (1, 2, and 4).
> * Different numbers of threads (1, 4, 10, 40).
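> For reference, below is a minimal sketch of the topology described above, 
> written against the 0.10.2 Java DSL from Scala. The application id, broker 
> address, topic and store names, and the map/reduce bodies are placeholders, 
> and plain String serdes stand in for the two schema-registry serde pairs; 
> only the pipeline shape, the one-hour tumbling window with 168-hour 
> retention, and the five-minute commit interval come from the report.
> {code:scala}
> import java.util.Properties
> import java.util.concurrent.TimeUnit
> 
> import org.apache.kafka.common.serialization.Serdes
> import org.apache.kafka.streams.{KafkaStreams, KeyValue, StreamsConfig}
> import org.apache.kafka.streams.kstream.{KStreamBuilder, KeyValueMapper,
>   Reducer, TimeWindows, Windowed}
> 
> object OffHeapGrowthSketch extends App {
>   val props = new Properties()
>   props.put(StreamsConfig.APPLICATION_ID_CONFIG, "offheap-growth-sketch")  // placeholder id
>   props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092")      // placeholder broker
>   props.put(StreamsConfig.COMMIT_INTERVAL_MS_CONFIG, "300000")             // five-minute commit interval
> 
>   // Plain String serdes stand in for the schema-registry serdes.
>   val keySerde = Serdes.String()
>   val valSerde = Serdes.String()
> 
>   val builder = new KStreamBuilder()
> 
>   // source -> map -> groupByKey -> reduce (1h tumbling window, 168h retention) -> to
>   builder
>     .stream(keySerde, valSerde, "source-topic")        // placeholder topic
>     .map[String, String](new KeyValueMapper[String, String, KeyValue[String, String]] {
>       override def apply(k: String, v: String) = new KeyValue(k, v)  // placeholder map step
>     })
>     .groupByKey(keySerde, valSerde)
>     .reduce(
>       new Reducer[String] {
>         override def apply(v1: String, v2: String) = v2              // placeholder reduce step
>       },
>       TimeWindows.of(TimeUnit.HOURS.toMillis(1)).until(TimeUnit.HOURS.toMillis(168)),
>       "hourly-window-store")                           // placeholder store name
>     .toStream()
>     .map[String, String](new KeyValueMapper[Windowed[String], String, KeyValue[String, String]] {
>       override def apply(wk: Windowed[String], v: String) = new KeyValue(wk.key(), v)
>     })
>     .to(keySerde, valSerde, "sink-topic")              // placeholder topic
> 
>   new KafkaStreams(builder, props).start()
> }
> {code}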
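> Likewise, a minimal sketch of the custom RocksDBConfigSetter described 
> above; the class name is a placeholder, and the 16k block size is just one 
> of the values from the block-size experiments listed above.
> {code:scala}
> import java.util.{Map => JMap}
> 
> import org.apache.kafka.streams.state.RocksDBConfigSetter
> import org.rocksdb.{BlockBasedTableConfig, Options}
> 
> // Registered through the streams config, e.g.
> //   props.put(StreamsConfig.ROCKSDB_CONFIG_SETTER_CLASS_CONFIG, classOf[CustomRocksDBConfig])
> class CustomRocksDBConfig extends RocksDBConfigSetter {
>   override def setConfig(storeName: String, options: Options, configs: JMap[String, AnyRef]): Unit = {
>     // As described above: one background compaction thread per available processor.
>     options.setMaxBackgroundCompactions(Runtime.getRuntime.availableProcessors)
> 
>     // The block-size experiments (4k, 16k, 32k) plug in here.
>     val tableConfig = new BlockBasedTableConfig()
>     tableConfig.setBlockSize(16 * 1024L)
>     options.setTableFormatConfig(tableConfig)
>   }
> }
> {code}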



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
