[ 
https://issues.apache.org/jira/browse/HADOOP-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12591645#action_12591645
 ] 

Devaraj Das commented on HADOOP-3280:
-------------------------------------

Summarizing the points: 
1) We don't want to have bash-related dependencies in the config.
2) We ideally want to have a wrapper for tasks.
3) We don't want to have default values for the ulimit setting.
4) Allen, in an offline email, brought up another point: tasks can spawn 
processes of their own. In that case, the problem becomes managing the ulimit 
for all of them. 

For (3) and (4), it sounds like we should somehow cap memory usage on a 
per-user basis, which would address both needs. Most OSes already support 
that, but today Hadoop can't make use of the feature since all tasks run with 
the uid of the tasktracker. (2) is a possible way to inject the ulimit and 
other env settings in the config (and also have some actions taken after a 
task exits). (2), in its entirety, doesn't seem to be in the scope of this 
issue.
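
To make (2) and (4) a bit more concrete, here is a rough sketch of what a 
task wrapper could do. It is written in Python purely for illustration and is 
not taken from any patch on this issue; run_task and limit_address_space are 
made-up names. It assumes a UNIX-like OS where RLIMIT_AS is available. No 
limit is applied unless one is given, in line with (3).

```python
import resource
import subprocess


def limit_address_space(max_bytes):
    """Build a hook that caps the virtual address space of a child process.

    Hypothetical helper: returns None when max_bytes is None, so that the
    default is "no limit".
    """
    if max_bytes is None:
        return None

    def apply_limit():
        # Runs in the child after fork() and before exec(). Resource limits
        # are inherited, so processes the task spawns start with the same cap.
        resource.setrlimit(resource.RLIMIT_AS, (max_bytes, max_bytes))

    return apply_limit


def run_task(argv, max_bytes=None):
    """Run the task command under the optional limit; return its exit code."""
    proc = subprocess.run(argv, preexec_fn=limit_address_space(max_bytes))
    # Any actions to take after the task exits would go here.
    return proc.returncode
```

For example, run_task(["/bin/sh", "-c", "ulimit -v"], 2 * 1024**3) should 
report the 2 GB cap from inside the child. Note that the cap applies to each 
process separately: every process the task spawns gets its own full 
allowance, so this does not bound the total, which is the problem in (4).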

So, in short, it seems like we cannot do a very good job of restricting 
resource usage without knowing the uid of the tasks on a per task basis when 
the tasks could be from different jobs submitted by different users.

IMO, the options we are left with are: (a) do at least what Arun did in his 
patch, with the default value set to _no limit_, or (b) revert HADOOP-2765 and 
deal with this problem when we have better infrastructure for handling uids 
for tasks.

Thoughts?

> virtual address space limits break streaming apps
> -------------------------------------------------
>
>                 Key: HADOOP-3280
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3280
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/streaming
>            Reporter: Rick Cox
>            Assignee: Amareshwari Sriramadasu
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: HADOOP-3280_0_20080418.patch, patch-3280.txt, 
> patch-3280.txt
>
>
> HADOOP-2765 added a mandatory, hard virtual address space limit to streaming 
> apps based on the Java process's -Xmx setting.
> This makes it impossible to run a 64-bit streaming app that needs large 
> address spaces under a 32-bit JVM, even if one is otherwise willing to 
> dramatically increase the -Xmx setting without cause. Also, unlike Java's 
> -Xmx limit, the virtual address space limit for an arbitrary UNIX process 
> does not necessarily correspond to RAM usage, so it's likely to be a limit 
> that is relatively difficult to configure.
> 2765 was originally opened to allow an optional wrapper script around 
> streaming tasks, one use case for which was setting a ulimit. That approach 
> seems much less intrusive and more flexible than the final implementation. 
> The ulimit can also be trivially set by the streaming task itself without any 
> support from Hadoop.
> Marking this as an 0.17 blocker because it will break deployed apps and there 
> is no workaround available.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
