[ https://issues.apache.org/jira/browse/FLINK-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014357#comment-16014357 ]
Dmytro Shkvyra commented on FLINK-6613:
---------------------------------------

Hi [~dernasherbrezon],

I assume you are using ParallelGC. Please see https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/parallel.html
{quote}
The parallel collector throws an OutOfMemoryError if too much time is being spent in garbage collection (GC): If more than 98% of the total time is spent in garbage collection and less than 2% of the heap is recovered, then an OutOfMemoryError is thrown. This feature is designed to prevent applications from running for an extended period of time while making little or no progress because the heap is too small. If necessary, this feature can be disabled by adding the option -XX:-UseGCOverheadLimit to the command line.
{quote}
and if
{quote}
3) Send 3300 messages each 635Kb. So total size is ~2G
{quote}
then ParallelGC can't collect all the garbage in time.

BTW, there are two parallel GC algorithms (see http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html), and the old one cleans the old generation as well.

I think we can close this ticket, because the problem can be solved by using G1 GC and is out of scope for Flink.

> OOM during reading big messages from Kafka
> ------------------------------------------
>
> Key: FLINK-6613
> URL: https://issues.apache.org/jira/browse/FLINK-6613
> Project: Flink
> Issue Type: Bug
> Components: Kafka Connector
> Affects Versions: 1.2.0
> Reporter: Andrey
>
> Steps to reproduce:
> 1) Setup Task manager with 2G heap size
> 2) Setup job that reads messages from Kafka 10 (i.e. FlinkKafkaConsumer010)
> 3) Send 3300 messages each 635Kb. So total size is ~2G
> 4) OOM in task manager.
> According to heap dump:
> 1) KafkaConsumerThread read messages with total size ~1G.
> 2) Pass them to the next operator using org.apache.flink.streaming.connectors.kafka.internal.Handover
> 3) Then began to read another batch of messages.
> 4) Task manager was able to read next batch of ~500Mb messages until OOM.
> Expected:
> 1) Either have constraint like "number of messages in-flight" OR
> 2) Read next batch of messages only when previous batch processed OR
> 3) Any other option which will solve OOM.

--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
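
For reference, the first "Expected" option in the issue (a constraint on the number of messages in flight) can be illustrated with a minimal Java sketch. This is an illustration only and makes several assumptions: BoundedHandover and maxInFlight are hypothetical names, the class is not Flink's org.apache.flink.streaming.connectors.kafka.internal.Handover, and the bound counts elements rather than bytes. It relies only on java.util.concurrent.ArrayBlockingQueue.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Hypothetical bounded hand-over between a reader thread and a downstream
 * operator. NOT Flink's Handover class; names are illustrative only.
 */
class BoundedHandover<T> {
    private final BlockingQueue<T> queue;

    /** maxInFlight (assumed name) caps how many elements may be pending. */
    BoundedHandover(int maxInFlight) {
        this.queue = new ArrayBlockingQueue<>(maxInFlight);
    }

    /** Blocks the producer while maxInFlight elements are already pending. */
    void produce(T element) throws InterruptedException {
        queue.put(element);
    }

    /** Non-blocking variant: returns false if the in-flight limit is reached. */
    boolean tryProduce(T element) {
        return queue.offer(element);
    }

    /** Blocks the consumer until an element is available. */
    T consume() throws InterruptedException {
        return queue.take();
    }

    /** Non-blocking variant: returns null if nothing is pending. */
    T tryConsume() {
        return queue.poll();
    }

    /** Number of elements currently waiting to be consumed. */
    int inFlight() {
        return queue.size();
    }
}
```

With such a bound the reader stops fetching once the limit is reached, so pending data stays at roughly maxInFlight times the message size instead of growing with the topic backlog. A byte-based limit would need additional bookkeeping that is not shown here.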