Bhumika Bayani created FLINK-8622:
-------------------------------------

             Summary: flink-mesos: High memory usage of scheduler + job 
manager. GC never kicks in.
                 Key: FLINK-8622
                 URL: https://issues.apache.org/jira/browse/FLINK-8622
             Project: Flink
          Issue Type: Bug
    Affects Versions: 1.3.2, 1.4.0
            Reporter: Bhumika Bayani


We are deploying a Flink cluster with 1 jobmanager and 6 taskmanagers on Mesos.

We have observed that the memory usage of the jobmanager is high. Even as we 
allocate more and more memory to it, it hits the limit within minutes.

We started with 1.5 GB RAM and a 1 GB heap. Currently we have allocated 4 GB 
RAM and a 3 GB heap to the jobmanager, which also acts as the scheduler. We 
also tried allocating 8 GB RAM while keeping the heap the same (3 GB). In that 
case too, the memory graph was identical.

As the graph below shows, the scheduler almost always runs at its maximum 
memory allocation.

!flink-mem-usage-graph-for-jira.png!

 

Throughout the run of the scheduler, we do not see memory usage go down 
unless it is killed due to OOM. From this we infer that garbage collection is 
never happening.
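
One way to check this inference directly, rather than from the container 
memory graph, is to read the JVM's own garbage collection counters. The 
following is only a minimal sketch using the standard java.lang.management 
beans. The class name GcProbe and its helper methods are hypothetical and are 
not part of Flink. As written it reports on the JVM it runs in, so for the 
jobmanager the same beans would have to be read inside that process or over 
JMX.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

// Hypothetical helper, not part of Flink: reports GC activity and heap usage
// of the JVM in which it is running.
public class GcProbe {

    // Sum of collection counts over all collectors of this JVM.
    // A collector reports -1 when the count is undefined; those are skipped.
    static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount();
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Heap bytes currently in use, as reported by the JVM itself.
    static long heapUsedBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) {
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.printf("%s: collections=%d, timeMs=%d%n",
                    gc.getName(), gc.getCollectionCount(), gc.getCollectionTime());
        }
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        // getMax() may be -1 if the maximum heap size is undefined.
        System.out.printf("heap used=%d bytes, max=%d bytes%n",
                heap.getUsed(), heap.getMax());
    }
}
```

If the collection counts keep rising while the container memory stays flat at 
the limit, garbage collection is running and the graph alone would not show 
it.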

We have tried both Flink 1.3 and 1.4 and see the same issue on both 
versions.

 

Is there any way we can find out where and how memory is being used? 

Are there any Flink config options for the jobmanager, or JVM parameters, that 
can help us restrict memory usage, force garbage collection, and prevent the 
jobmanager from crashing? 

Please let us know if there are any resource recommendations from Flink for 
running Flink on Mesos at scale.

 



