[ 
https://issues.apache.org/jira/browse/HADOOP-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12590618#action_12590618
 ] 

Sameer Paranjpye commented on HADOOP-3280:
------------------------------------------

> For example, some tasks may mmap multi-GB files but touch only a few pages.
> Others may link libraries that require 100s of MB of address space for code
> that's never executed ...

Fair point. It seems to me we have 3 options:

- Provide a way to disable the ulimit. This doesn't address the general problem
of constraining/permitting resource usage in arbitrary ways, among other things.
- Allow users/admins to specify a wrapper script. This will clearly work but
feels like it creates unnecessary work for users and admins, who have to worry
about obtaining the command line from the TaskTracker and exec'ing it.
This is a chore if all you want to do is set a ulimit or specify a couple of
environment variables.
- Allow users/admins to specify a prologue script. The prologue is run before
the task, in the same shell. This lets people set resource limits,
environment variables, etc. that affect the task. This may be more limiting than
a wrapper in some cases, but in those scenarios one always has the option of
baking the wrapper into the job.
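
To make the difference between the wrapper and prologue options concrete, here
is a rough Python sketch of how a launcher might build the command line in each
case. It is not the TaskTracker's actual code: the function names, the use of
`sh -c`, and the SCRATCH_DIR variable are all assumptions made for illustration.

```python
import shlex
import subprocess


def wrapper_command(wrapper_path, task_argv):
    """Wrapper option (hypothetical helper): the wrapper script receives the
    task command line as its arguments and must exec it itself."""
    return [wrapper_path, *task_argv]


def prologue_command(prologue, task_argv):
    """Prologue option (hypothetical helper): run the prologue text, then exec
    the task from the same shell so limits and exported variables carry over."""
    task = " ".join(shlex.quote(arg) for arg in task_argv)
    # `exec` replaces the shell with the task, which inherits whatever the
    # prologue set up (ulimit values, environment variables).
    return ["sh", "-c", f"{prologue}\nexec {task}"]


def run(argv):
    """Run a command and return its stdout with surrounding whitespace removed."""
    result = subprocess.run(argv, capture_output=True, text=True, check=True)
    return result.stdout.strip()


# Stand-in "task" that reports a variable the prologue is expected to export.
# SCRATCH_DIR is an invented name, not a Hadoop setting.
demo_task = ["sh", "-c", "echo $SCRATCH_DIR"]
demo_prologue = "ulimit -c 0\nexport SCRATCH_DIR=/tmp/scratch"
```

Calling `run(prologue_command(demo_prologue, demo_task))` on a POSIX system
runs the task with the core-file limit and the variable already in place, and
the prologue author never has to handle the task's command line. With
`wrapper_command`, the script at `wrapper_path` gets that command line as
arguments and is responsible for exec'ing it.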

Thoughts?

> virtual address space limits break streaming apps
> -------------------------------------------------
>
>                 Key: HADOOP-3280
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3280
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/streaming
>            Reporter: Rick Cox
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: HADOOP-3280_0_20080418.patch
>
>
> HADOOP-2765 added a mandatory, hard virtual address space limit to streaming 
> apps based on the Java process's -Xmx setting.
> This makes it impossible to run a 64-bit streaming app that needs large 
> address spaces under a 32-bit JVM, even if one is otherwise willing to 
> dramatically increase the -Xmx setting without cause. Also, unlike Java's 
> -Xmx limit, the virtual address space limit for an arbitrary UNIX process 
> does not necessarily correspond to RAM usage, so it's likely to be a
> relatively difficult limit to configure.
> 2765 was originally opened to allow an optional wrapper script around 
> streaming tasks, one use case for which was setting a ulimit. That approach 
> seems much less intrusive and more flexible than the final implementation. 
> The ulimit can also be trivially set by the streaming task itself without any 
> support from Hadoop.
> Marking this as an 0.17 blocker because it will break deployed apps and there 
> is no workaround available.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
