[jira] Issue Comment Edited: (HADOOP-2765) setting memory limits for tasks

Amareshwari Sri Ramadasu (JIRA) Fri, 29 Feb 2008 04:57:53 -0800

    [ 
https://issues.apache.org/jira/browse/HADOOP-2765?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12573711#action_12573711
 ]


amareshwari edited comment on HADOOP-2765 at 2/29/08 4:56 AM:
---------------------------------------------------------------------------

The patch addresses the following issues:
1. The memory limit is now set with ulimit -v (instead of ulimit -m as in the 
previous patch), since ulimit -v sets the maximum amount of virtual memory 
available to the shell.
2. Now that all streaming tasks get the same virtual memory limit as the parent 
java task, launching a java streaming task (e.g. TrApp.class) requires more 
memory than 256MB (set in build-contrib.xml). I noticed this in the unit tests, 
so the value for maxmemory is increased to 384MB in build-contrib.xml. This is 
required for the existing streaming unit tests to pass.
3. Added a testcase in contrib/streaming. The test launches a streaming app 
that allocates 10MB of memory. First, the program is launched with sufficient 
memory, and the test expects it to succeed. Then the program is launched with 
insufficient memory, and the test expects it to fail.
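
As a rough illustration of item 1 (not the patch's actual code, which is not 
shown in this message), the sketch below wraps a task command so that it runs 
under ulimit -v. The function name and the example command line are 
hypothetical. The one detail taken as fact is that bash's ulimit -v expects 
the limit in units of 1024 bytes, so 384MB becomes 393216.

```python
import shlex


def ulimit_wrapped_command(task_argv, limit_mb):
    """Return an argv that runs task_argv under `ulimit -v`.

    Hypothetical helper for illustration only. bash's `ulimit -v` takes
    the limit in units of 1024 bytes, so megabytes are converted to KB.
    """
    if limit_mb <= 0:
        raise ValueError("limit_mb must be positive")
    limit_kb = limit_mb * 1024
    # Quote each argument so the shell passes it through unchanged.
    quoted = " ".join(shlex.quote(arg) for arg in task_argv)
    # `exec` replaces the shell with the task, which inherits the limit.
    return ["bash", "-c", f"ulimit -v {limit_kb}; exec {quoted}"]
```

For example, ulimit_wrapped_command(["java", "-Xmx200m", "TrApp"], 384) yields 
a bash invocation whose script is "ulimit -v 393216; exec java -Xmx200m TrApp".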

      was (Author: amareshwari):
    The patch addresses the following issues:
1. The memory limit is now set with ulimit -v (instead of ulimit -m as in the 
previous patch), since ulimit -v sets the maximum amount of virtual memory 
available to the shell.
2. Now that all streaming tasks get the same virtual memory limit as the parent 
java task, launching a local job runner in a streaming task requires more 
memory than 256MB (set in build-contrib.xml). I noticed this in the unit tests, 
so that value is increased to 384MB. Now the unit tests are fine.
3. Added a testcase in contrib/streaming. The test launches a streaming app 
that allocates 10MB of memory. First, the program is launched with sufficient 
memory, and the test expects it to succeed. Then the program is launched with 
insufficient memory, and the test expects it to fail.
  
> setting memory limits for tasks
> -------------------------------
>
>                 Key: HADOOP-2765
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2765
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: contrib/streaming
>    Affects Versions: 0.15.3
>            Reporter: Joydeep Sen Sarma
>            Assignee: Amareshwari Sri Ramadasu
>             Fix For: 0.16.1
>
>         Attachments: patch-2765.txt, patch-2765.txt, patch-2765.txt, 
> patch-2765.txt, patch-2765.txt, patch-2765.txt
>
>
> here's the motivation:
> we want to put a memory limit on user scripts to prevent runaway scripts from 
> bringing down nodes. this setting is much lower than the max. memory that can 
> be used (since most likely these tend to be scripting bugs). At the same time 
> - for careful users, we want to be able to let them use more memory by 
> overriding this limit.
> there's no good way to do this. we can set ulimit in hadoop shell scripts - 
> but they are very restrictive. there doesn't seem to be a way to do a 
> setrlimit from Java - and setting a ulimit means that supplying a higher Xmx 
> limit from the jobconf is useless (the java process will be limited by the 
> ulimit setting when the tasktracker was launched).
> what we have ended up doing (and i think this might help others as well) is 
> to have a stream.wrapper option. the value of this option is a program 
> through which streaming mapper and reducer scripts are execed. in our case, 
> this wrapper is a small C program that does a setrlimit and then execs the 
> streaming job. the default wrapper puts a reasonable limit on the memory 
> usage - but users can easily override this wrapper (eg by invoking it with 
> different memory limit argument). we can use the wrapper for other system 
> wide resource limits (or any environment settings) as well in future.
> This way - JVMs can stick to mapred.child.opts as the way to control memory 
> usage. This setup has saved our ass on many occasions while allowing 
> sophisticated users to use high memory limits.
> Can submit patch if this sounds interesting.
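
The wrapper described above is a small C program, and its source is not 
included here. The following is only a sketch of the same idea in Python: set 
an address-space limit with setrlimit, then exec the streaming command. The 
-m flag, the default of 512MB, and all function names are assumptions made for 
illustration. RLIMIT_AS is the resource that ulimit -v controls.

```python
import os
import resource

DEFAULT_LIMIT_MB = 512  # hypothetical default; the original value is not stated


def parse_wrapper_args(argv):
    """Split wrapper arguments into (limit_mb, command).

    Assumed usage: wrapper [-m LIMIT_MB] command [args...]
    """
    args = list(argv)
    limit_mb = DEFAULT_LIMIT_MB
    if len(args) >= 2 and args[0] == "-m":
        limit_mb = int(args[1])
        args = args[2:]
    if not args:
        raise ValueError("no command given")
    return limit_mb, args


def limit_and_exec(limit_mb, command):
    """Cap virtual memory for this process, then replace it with command."""
    limit = limit_mb * 1024 * 1024  # setrlimit takes bytes for RLIMIT_AS
    _, hard = resource.getrlimit(resource.RLIMIT_AS)
    if hard != resource.RLIM_INFINITY:
        # An unprivileged process cannot raise its hard limit.
        limit = min(limit, hard)
    resource.setrlimit(resource.RLIMIT_AS, (limit, limit))
    os.execvp(command[0], command)  # does not return on success


def main(argv):
    limit_mb, command = parse_wrapper_args(argv)
    limit_and_exec(limit_mb, command)
```

A script built on this would call main(sys.argv[1:]). A careful user could 
then raise the limit by invoking the wrapper with a larger -m value, which 
mirrors the override described in the issue.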

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
