GitHub user rmetzger opened a pull request:
https://github.com/apache/flink/pull/709
[YARN] Cut off 25% of the heap as a safety margin
Users were reporting issues with Flink on YARN because the NodeManager was
killing containers due to resource overuse.
When a user requests, for example, a 4GB TaskManager, we cannot just set the
-Xmx value to 4G, because the Linux process (that's what YARN is
monitoring) then grows bigger than 4GB: the JVM also uses memory outside
the heap. YARN is also very strict about that limit.
So we subtract a certain fraction of the user-specified memory before
setting the heap size. In the past we used 15%, but that was apparently
not enough. I've experimented a lot, and 25% seems to be a good value.
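As a rough illustration of the arithmetic described above (a sketch only, not
the code in this pull request): the class name, method name, and the choice
to round the cutoff up to a whole megabyte are all assumptions made for the
example.

```java
// Hypothetical helper, NOT Flink's actual implementation.
// Names and the rounding behaviour are assumptions for illustration.
class HeapCutoffSketch {

    /**
     * Returns the heap size in MB to pass as -Xmx, after reserving
     * cutoffRatio of the requested container memory for non-heap usage.
     */
    static int heapSizeMb(int containerMb, double cutoffRatio) {
        if (containerMb <= 0) {
            throw new IllegalArgumentException("containerMb must be positive");
        }
        if (cutoffRatio < 0.0 || cutoffRatio >= 1.0) {
            throw new IllegalArgumentException("cutoffRatio must be in [0, 1)");
        }
        // Round the cutoff up so the safety margin is never smaller than requested.
        int cutoffMb = (int) Math.ceil(containerMb * cutoffRatio);
        return containerMb - cutoffMb;
    }

    public static void main(String[] args) {
        int containerMb = 4096; // the 4GB TaskManager from the example above
        System.out.println("15% cutoff: -Xmx" + heapSizeMb(containerMb, 0.15) + "m");
        System.out.println("25% cutoff: -Xmx" + heapSizeMb(containerMb, 0.25) + "m");
    }
}
```

With a 4096 MB container, a 25% cutoff in this sketch leaves 3072 MB for the
heap and 1024 MB of headroom for everything else the process allocates.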
This pull request also fixes potential null pointer exceptions in the web
frontend and adds thread and classloader logging to the TaskManagers.
Please discuss whether you want this.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/rmetzger/flink rm_master
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/flink/pull/709.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #709
----
commit 506bef7405975852ec4121d64c426dd63dca5bb5
Author: Robert Metzger <[email protected]>
Date: 2015-05-19T17:38:02Z
[jobmanager] Fix potential null pointer exception in jobmanager webfrontend
commit 40cb1f94870618f915c27e02badd4864668bed66
Author: Robert Metzger <[email protected]>
Date: 2015-05-21T11:49:02Z
[yarn] Increase default heap cutoff to 25%
----