Jon Buffington created KAFKA-5122:
-------------------------------------

             Summary: Kafka Streams off-heap memory leak
                 Key: KAFKA-5122
                 URL: https://issues.apache.org/jira/browse/KAFKA-5122
             Project: Kafka
          Issue Type: Bug
          Components: streams
    Affects Versions: 0.10.2.0
         Environment: Linux 64-bit
Oracle JVM version "1.8.0_121"
            Reporter: Jon Buffington


I have a Kafka Streams application that leaks off-heap memory at a rate of 20MB 
per commit interval. The application is configured with a 1G heap; the heap 
memory does not show signs of leaking. The application reaches 16g of system 
memory usage before terminating and restarting.
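
As a rough check on these figures, the sketch below estimates how long a 
constant leak takes to accumulate 16g. It assumes the five-minute commit 
interval listed under the application facts, treats 16g as 16 * 1024 MB, and 
ignores the baseline footprint (heap, metaspace, and so on), so it gives an 
upper bound on the time to restart rather than a measurement. The class and 
method names are illustrative only.

```java
public class LeakEstimate {
    /**
     * Minutes until limitMb of memory has accumulated, assuming a constant
     * leak of leakMbPerInterval every intervalMinutes and a starting point of zero.
     */
    public static double minutesToReach(double limitMb,
                                        double leakMbPerInterval,
                                        double intervalMinutes) {
        if (leakMbPerInterval <= 0 || intervalMinutes <= 0) {
            throw new IllegalArgumentException("leak and interval must be positive");
        }
        return limitMb / leakMbPerInterval * intervalMinutes;
    }

    public static void main(String[] args) {
        double limitMb = 16 * 1024;   // 16g, assumed to mean 16 GiB
        double intervalMinutes = 5;   // commit interval of 300000ms
        // 20MB per interval on 0.10.2.0-cp1, ~10MB on 0.10.2.1 RC3
        for (double leakMb : new double[] {20, 10}) {
            double hours = minutesToReach(limitMb, leakMb, intervalMinutes) / 60.0;
            System.out.printf("%.0f MB per interval: about %.1f hours%n", leakMb, hours);
        }
    }
}
```

At 20MB per interval this comes to roughly 68 hours, a little under three 
days; at the ~10MB rate it is twice that.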

Application facts:
* The data pipeline is source -> map -> groupByKey -> reduce -> to.
* The reduce operation uses a tumbling time window 
TimeWindows.of(TimeUnit.HOURS.toMillis(1)).until(TimeUnit.HOURS.toMillis(168)).
* The commit interval is five minutes (300000ms).
* The application links to v0.10.2.0-cp1 of the Kafka libraries. When I link to 
the current 0.10.2.1 RC3, the leak rate changes to ~10MB per commit interval.
* The application uses the schema registry for two pairs of serdes. One serde 
pair is used to read from a source topic that has 40 partitions. The other 
serde pair is used by the internal changelog and repartition topics created by 
the groupByKey/reduce operations.
* The source input rate varies between 500-1500 records/sec. The source rate 
variation does not change the size or frequency of the leak.
* The application heap has been configured using both 1024m and 2048m. The only 
observed difference between the two JVM heap sizes is more old gen collections 
at 1024m although there is little difference in throughput. JVM settings are 
{-server -Djava.awt.headless=true -Xss256k -XX:MaxMetaspaceSize=128m 
-XX:ReservedCodeCacheSize=64m -XX:CompressedClassSpaceSize=32m 
-XX:MaxDirectMemorySize=128m -XX:+AlwaysPreTouch -XX:+UseG1GC 
-XX:MaxGCPauseMillis=50 -XX:InitiatingHeapOccupancyPercent=35 
-XX:+PerfDisableSharedMem -XX:+UseStringDeduplication 
-XX:MinMetaspaceFreeRatio=50 -XX:MaxMetaspaceFreeRatio=80}
* We configure a custom RocksDBConfigSetter to set 
options.setMaxBackgroundCompactions(Runtime.getRuntime.availableProcessors)
* Per 
<http://mail-archives.apache.org/mod_mbox/kafka-users/201702.mbox/%3ccahwhrrxxpwgyvr1ctwgoudkr7cqkaq+52phfpuzs4j-wv7k...@mail.gmail.com%3e>,
 the SSTables are being compacted. Total disk usage for the state files 
(RocksDB) is ~2.5g. Per partition and window, there are 3-4 SSTables.
* The application is written in Scala and compiled using version 2.12.1.
* Oracle JVM version "1.8.0_121"
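
For reference, the window and commit settings listed above work out as 
follows. This is a small sketch using only java.util.concurrent.TimeUnit; the 
class and method names are hypothetical, and it does not model how Kafka 
Streams lays out windows in RocksDB.

```java
import java.util.concurrent.TimeUnit;

public class WindowMath {
    /** Number of non-overlapping (tumbling) windows of sizeMs that fit in retentionMs. */
    public static long windowsInRetention(long sizeMs, long retentionMs) {
        if (sizeMs <= 0) {
            throw new IllegalArgumentException("window size must be positive");
        }
        return retentionMs / sizeMs;
    }

    public static void main(String[] args) {
        long sizeMs = TimeUnit.HOURS.toMillis(1);         // argument to TimeWindows.of(...)
        long retentionMs = TimeUnit.HOURS.toMillis(168);  // argument to until(...)
        long commitMs = TimeUnit.MINUTES.toMillis(5);     // commit interval

        System.out.println("window size (ms):     " + sizeMs);
        System.out.println("retention (ms):       " + retentionMs);
        System.out.println("retention (days):     " + TimeUnit.MILLISECONDS.toDays(retentionMs));
        System.out.println("windows in retention: " + windowsInRetention(sizeMs, retentionMs));
        System.out.println("commit interval (ms): " + commitMs);
    }
}
```

The retention of 168 hours is seven days, so up to 168 one-hour tumbling 
windows per key fall within the retention period at any time.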

Various experiments that had no effect on the leak rate:
* Different RocksDB block sizes (4k, 16k, and 32k).
* Different numbers of instances (1, 2, and 4).
* Different numbers of threads (1, 4, 10, and 40).



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
