[ https://issues.apache.org/jira/browse/FLINK-8622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Aljoscha Krettek updated FLINK-8622:
------------------------------------
    Component/s: ResourceManager
                 Distributed Coordination

> flink-mesos: High memory usage of scheduler + job manager. GC never kicks in.
> -----------------------------------------------------------------------------
>
>                 Key: FLINK-8622
>                 URL: https://issues.apache.org/jira/browse/FLINK-8622
>             Project: Flink
>          Issue Type: Bug
>          Components: Distributed Coordination, Mesos, ResourceManager
>    Affects Versions: 1.4.0, 1.3.2
>            Reporter: Bhumika Bayani
>            Priority: Blocker
>             Fix For: 1.5.0
>
>
> We are deploying a Flink cluster with 1 jobmanager and 6 taskmanagers on Mesos.
> We have observed that the memory usage of the jobmanager is high. Even though
> we keep allocating more memory to it, it hits the limit within minutes.
> We started with 1.5 GB RAM and 1 GB heap. Currently we have allocated 4 GB RAM
> and 3 GB heap to the jobmanager, which also acts as the scheduler. We also
> tried allocating 8 GB RAM with the same 3 GB heap (a proportionally smaller
> heap). In that case, too, the memory graph was identical.
> As the graph below shows, the scheduler almost always runs at its maximum
> memory allocation.
> !flink-mem-usage-graph-for-jira.png!
>
> Throughout the run of the scheduler, we do not see memory usage go down
> unless the process is killed due to OOM. From this we infer that garbage
> collection is never happening.
> We have tried both Flink 1.4 and 1.3 and see the same issue on both versions.
>
> Is there any way we can find out where and how memory is being used?
> Are there any Flink config options for the jobmanager, or JVM parameters, that
> can help us restrict the memory usage, force garbage collection, and prevent
> the process from crashing?
> Please let us know if there are any resource recommendations from Flink for
> running Flink on Mesos at scale.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
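
On the question of how to find out where memory is being used and whether garbage collection runs at all: the following is a minimal sketch, not a Flink facility, using only the standard java.lang.management API. It assumes the code can be run inside the JVM under investigation; the class name JvmMemoryProbe is hypothetical and used only for illustration. It reports heap and non-heap usage along with the collection count and accumulated time of each garbage collector.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper (not part of Flink): summarizes memory and GC activity
// of the JVM it runs in.
public class JvmMemoryProbe {

    /** Builds a one-line summary of heap, non-heap, and GC counters for the current JVM. */
    public static String snapshot() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();

        StringBuilder sb = new StringBuilder();
        // Shift right by 20 bits to convert bytes to mebibytes.
        sb.append("heapUsedMB=").append(heap.getUsed() >> 20)
          .append(" heapCommittedMB=").append(heap.getCommitted() >> 20)
          .append(" nonHeapUsedMB=").append(nonHeap.getUsed() >> 20);

        // One entry per collector (names depend on the JVM and GC settings).
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            String name = gc.getName().replace(' ', '_');
            sb.append(' ').append(name).append(".count=").append(gc.getCollectionCount())
              .append(' ').append(name).append(".timeMs=").append(gc.getCollectionTime());
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

If the collection counts keep increasing while the container-level memory graph stays flat, garbage collection is in fact running. In that case the flat graph may instead reflect committed heap that the JVM has not returned to the operating system, or memory held outside the heap.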