[
https://issues.apache.org/jira/browse/FLINK-13477?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Till Rohrmann closed FLINK-13477.
---------------------------------
Resolution: Duplicate
This issue was resolved as part of FLIP-49.
> Containerized TaskManager killed because of lack of memory overhead
> -------------------------------------------------------------------
>
> Key: FLINK-13477
> URL: https://issues.apache.org/jira/browse/FLINK-13477
> Project: Flink
> Issue Type: Improvement
> Components: Deployment / Mesos, Deployment / YARN
> Affects Versions: 1.9.0
> Reporter: Benoit Hanotte
> Priority: Minor
> Labels: pull-request-available
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Currently, the `-XX:MaxDirectMemorySize` parameter is set as:
> `MaxDirectMemorySize = containerMemoryMB - heapSizeMB`
> (see
> [https://github.com/apache/flink/blob/7fec4392b21b07c69ba15ea554731886f181609e/flink-runtime/src/main/java/org/apache/flink/runtime/clusterframework/ContaineredTaskManagerParameters.java#L162])
> However, as explained at
> https://docs.oracle.com/javase/8/docs/technotes/tools/unix/java.html,
> `MaxDirectMemorySize` only limits the memory available to NIO direct
> buffers. The JVM also uses off-heap memory that this limit does not cover
> (for example metaspace and thread stacks), so total off-heap usage can
> exceed that value, and Mesos or YARN then kills the container for
> exceeding its allocated memory.
> In addition, users might want to allocate off-heap memory through native
> code, in which case they will want to keep some of the container memory
> free and unallocated by Flink.
> To work around this issue, we currently set the following parameter:
> {code:java}
> -Dcontainerized.taskmanager.env.FLINK_ENV_JAVA_OPTS='-XX:MaxDirectMemorySize=600m'
> {code}
> which overrides the value that Flink picks (744M in this case) with a lower
> one, leaving some overhead memory in the TaskManager containers. However,
> this is an "ugly" hack: it circumvents the memory allocation that Flink
> performs and bypasses the sanity checks done in
> `ContaineredTaskManagerParameters`.
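
The arithmetic behind the description above can be sketched as follows. This is an illustrative sketch only, not Flink's implementation: the class and method names are hypothetical, and the container size (2048 MB), heap size (1304 MB) and overhead (144 MB) are assumed values chosen so that the results match the 744M and 600m figures quoted in the ticket. The ticket does not state the reporter's actual container or heap sizes.

```java
// Illustrative sketch only; names and values are assumptions, not Flink code.
class DirectMemorySketch {

    // Formula quoted in the ticket: all non-heap container memory
    // becomes the direct-memory limit, leaving no headroom.
    static long maxDirectMemoryMB(long containerMemoryMB, long heapSizeMB) {
        return containerMemoryMB - heapSizeMB;
    }

    // Hypothetical variant: hold back overheadMB for memory that the
    // direct-buffer limit does not cover (native allocations, JVM internals).
    static long maxDirectMemoryWithOverheadMB(long containerMemoryMB,
                                              long heapSizeMB,
                                              long overheadMB) {
        long directMB = containerMemoryMB - heapSizeMB - overheadMB;
        if (directMB <= 0) {
            throw new IllegalArgumentException(
                "No memory left for direct buffers: " + directMB + " MB");
        }
        return directMB;
    }

    // Renders the JVM flag in the form used by the workaround in the ticket.
    static String toJvmFlag(long directMB) {
        return "-XX:MaxDirectMemorySize=" + directMB + "m";
    }

    public static void main(String[] args) {
        long containerMemoryMB = 2048; // assumed container size
        long heapSizeMB = 1304;        // assumed heap size
        long overheadMB = 144;         // assumed reserved overhead

        // Limit without any reserved overhead.
        System.out.println(toJvmFlag(
            maxDirectMemoryMB(containerMemoryMB, heapSizeMB)));
        // Limit with part of the container left unallocated.
        System.out.println(toJvmFlag(
            maxDirectMemoryWithOverheadMB(containerMemoryMB, heapSizeMB, overheadMB)));
    }
}
```

With the first formula, heap plus the direct-memory limit already equals the container size, so any further off-heap usage pushes the process over the container limit. Subtracting an explicit overhead keeps the sum below the container size, which is what the manual `MaxDirectMemorySize=600m` override achieves by hand.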