[ 
https://issues.apache.org/jira/browse/FLINK-6613?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16014357#comment-16014357
 ] 

Dmytro Shkvyra commented on FLINK-6613:
---------------------------------------

Hi [~dernasherbrezon], I assume you are using ParallelGC.
Please see 
https://docs.oracle.com/javase/8/docs/technotes/guides/vm/gctuning/parallel.html
 
{quote}
The parallel collector throws an OutOfMemoryError if too much time is being 
spent in garbage collection (GC): If more than 98% of the total time is spent 
in garbage collection and less than 2% of the heap is recovered, then an 
OutOfMemoryError is thrown. This feature is designed to prevent applications 
from running for an extended period of time while making little or no progress 
because the heap is too small. If necessary, this feature can be disabled by 
adding the option -XX:-UseGCOverheadLimit to the command line.
{quote}
Given the load from the report,
{quote}
3) Send 3300 messages each 635Kb. So total size is ~2G
{quote}
ParallelGC can't collect all the garbage in time.
BTW, there are two parallel GC variants, see 
http://www.oracle.com/webfolder/technetwork/tutorials/obe/java/gc01/index.html 
and the "old" variant also cleans the old generation.
I think we can close this ticket, because the problem could be solved by using 
G1 GC and is out of scope for Flink.
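
Since the above rests on the assumption that ParallelGC is in use, here is a minimal sketch for checking which collectors a JVM is actually running with. The class name GcInfo is made up for illustration; it uses only the standard java.lang.management API and is not part of Flink.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of Flink: lists the collectors of the current JVM.
class GcInfo {

    /** Returns the names of the garbage collectors registered in this JVM. */
    static List<String> collectorNames() {
        List<String> names = new ArrayList<>();
        for (GarbageCollectorMXBean bean : ManagementFactory.getGarbageCollectorMXBeans()) {
            names.add(bean.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        // On HotSpot, ParallelGC typically shows up as "PS Scavenge" / "PS MarkSweep",
        // while G1 shows up as "G1 Young Generation" / "G1 Old Generation".
        System.out.println(collectorNames());
    }
}
```

Running this inside the task manager JVM (or with the same JVM options) should show whether the ParallelGC assumption holds before switching collectors.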

> OOM during reading big messages from Kafka
> ------------------------------------------
>
>                 Key: FLINK-6613
>                 URL: https://issues.apache.org/jira/browse/FLINK-6613
>             Project: Flink
>          Issue Type: Bug
>          Components: Kafka Connector
>    Affects Versions: 1.2.0
>            Reporter: Andrey
>
> Steps to reproduce:
> 1) Setup Task manager with 2G heap size
> 2) Setup job that reads messages from Kafka 10 (i.e. FlinkKafkaConsumer010)
> 3) Send 3300 messages each 635Kb. So total size is ~2G
> 4) OOM in task manager.
> According to heap dump:
> 1) KafkaConsumerThread read messages with total size ~1G.
> 2) Pass them to the next operator using 
> org.apache.flink.streaming.connectors.kafka.internal.Handover
> 3) Then began to read another batch of messages. 
> 4) Task manager was able to read next batch of ~500Mb messages until OOM.
> Expected:
> 1) Either have constraint like "number of messages in-flight" OR
> 2) Read next batch of messages only when previous batch processed OR
> 3) Any other option which will solve OOM.
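
The first expected option ("number of messages in-flight") can be illustrated with a bounded buffer between the reader thread and the downstream operator. This is only a sketch under stated assumptions: BoundedHandover and its methods are hypothetical names, and this is not the actual org.apache.flink.streaming.connectors.kafka.internal.Handover implementation.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical illustration of an in-flight limit; not Flink's Handover class.
class BoundedHandover {
    private final BlockingQueue<byte[]> inFlight;

    BoundedHandover(int maxInFlight) {
        this.inFlight = new ArrayBlockingQueue<>(maxInFlight);
    }

    /** Reader side: blocks while maxInFlight messages are still pending. */
    void produce(byte[] message) throws InterruptedException {
        inFlight.put(message);
    }

    /** Reader side, non-blocking: returns false when the buffer is full. */
    boolean tryProduce(byte[] message) {
        return inFlight.offer(message);
    }

    /** Downstream side: blocks until a message is available. */
    byte[] consume() throws InterruptedException {
        return inFlight.take();
    }

    /** Downstream side, non-blocking: returns null when the buffer is empty. */
    byte[] tryConsume() {
        return inFlight.poll();
    }

    /** Number of messages currently buffered. */
    int pending() {
        return inFlight.size();
    }
}
```

With such a limit, the reader cannot fetch further messages until the downstream side has taken earlier ones, so the memory held by buffered messages is bounded by maxInFlight times the largest message size.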



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
