[ https://issues.apache.org/jira/browse/HADOOP-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amareshwari Sri Ramadasu updated HADOOP-2765:
---------------------------------------------
Attachment: patch-2765.txt
> setting memory limits for tasks
> -------------------------------
>
> Key: HADOOP-2765
> URL: https://issues.apache.org/jira/browse/HADOOP-2765
> Project: Hadoop Core
> Issue Type: New Feature
> Components: contrib/streaming
> Affects Versions: 0.15.3
> Reporter: Joydeep Sen Sarma
> Assignee: Amareshwari Sri Ramadasu
> Fix For: 0.16.1
>
> Attachments: patch-2765.txt, patch-2765.txt, patch-2765.txt,
> patch-2765.txt, patch-2765.txt, patch-2765.txt
>
>
> Here's the motivation:
> We want to put a memory limit on user scripts to prevent runaway scripts from
> bringing down nodes. This limit is set much lower than the maximum memory
> that can be used, since runaway usage is most likely caused by scripting
> bugs. At the same time, we want to let careful users use more memory by
> overriding this limit.
> There's no good way to do this. We can set a ulimit in the hadoop shell
> scripts, but that is very restrictive. There doesn't seem to be a way to call
> setrlimit from Java, and setting a ulimit means that supplying a higher Xmx
> limit from the jobconf is useless (the java process will be limited by the
> ulimit that was in effect when the tasktracker was launched).
> What we have ended up doing (and I think this might help others as well) is
> to add a stream.wrapper option. The value of this option is a program
> through which streaming mapper and reducer scripts are exec'ed. In our case,
> the wrapper is a small C program that calls setrlimit and then execs the
> streaming job. The default wrapper puts a reasonable limit on memory
> usage, but users can easily override it (e.g. by invoking the wrapper with a
> different memory limit argument). We can also use the wrapper for other
> system-wide resource limits (or any environment settings) in the future.
> This way, JVMs can stick to mapred.child.opts as the way to control memory
> usage. This setup has saved our ass on many occasions while allowing
> sophisticated users to use high memory limits.
> I can submit a patch if this sounds interesting.
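
The wrapper idea described above can be sketched roughly as follows. The issue says the real wrapper is a small C program. This is only a Python illustration of the same setrlimit-then-exec pattern, not the reporter's code. The `-m <megabytes>` argument, the 512 MB default, and the function names are all hypothetical. The issue does not say how streaming passes the script to the wrapper, so the sketch assumes the command simply follows the wrapper's own arguments.

```python
import os
import resource
import sys

DEFAULT_LIMIT_MB = 512  # hypothetical default; the issue does not give a value


def parse_args(argv):
    """Split wrapper arguments into (limit_in_bytes, command).

    Assumed (hypothetical) calling conventions:
        wrapper.py <command> [args...]                 -> default limit
        wrapper.py -m <megabytes> <command> [args...]  -> overridden limit
    """
    limit_mb = DEFAULT_LIMIT_MB
    if len(argv) >= 2 and argv[0] == "-m":
        limit_mb = int(argv[1])
        argv = argv[2:]
    if limit_mb <= 0:
        raise ValueError("memory limit must be positive")
    if not argv:
        raise ValueError("no command given")
    return limit_mb * 1024 * 1024, list(argv)


def limit_and_exec(limit_bytes, command):
    """Cap the address space, then replace this process with the command."""
    # Resource limits are inherited across exec, so the mapper or reducer
    # script runs under the same cap. Does not return on success.
    resource.setrlimit(resource.RLIMIT_AS, (limit_bytes, limit_bytes))
    os.execvp(command[0], command)


# With no arguments there is nothing to run, so do nothing.
if __name__ == "__main__" and len(sys.argv) > 1:
    limit_and_exec(*parse_args(sys.argv[1:]))
```

Because the limit is applied in the wrapper rather than in the shell that starts the tasktracker, the task JVMs are unaffected, and a user who needs more memory only has to pass a different limit argument. Note that setrlimit raises an error if the requested value exceeds the hard limit already in effect for the process.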