[ https://issues.apache.org/jira/browse/HADOOP-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12590575#action_12590575 ]

Rick Cox commented on HADOOP-3280:
----------------------------------

I'd guess there are many users who do not want Hadoop to limit tasks (be they 
Java or streaming). When a cluster exists to run specific tasks, it seems 
reasonable to let those tasks use all of its resources.

On this issue, a default {{ulimit -v}} will cause some pretty strange failures 
while also failing to prevent resource exhaustion in other cases. For example, 
some tasks may mmap multi-GB files but touch only a few pages. Others may link 
libraries that require 100s of MB of address space for code that's never 
executed (and thus never read). Still others may fork off lots of sub-processes 
and thus ultimately consume more RAM than any single process's virtual address 
space. (Btw, these examples are all taken from our deployed Hadoop apps.)

Further, when these tasks hit the virtual address space limit, they will 
likely fail in confusing, difficult-to-debug ways: few apps are written to 
handle that case gracefully, and the same commands work fine when run outside 
of Hadoop, so the user has no hint unless they read the streaming code and 
notice that it is imposing this limit. (This is in contrast to the -Xmx 
limit, which can actually influence the garbage collector to be more 
aggressive, is a commonly used Java option, and produces relatively clear 
OutOfMemoryErrors on failure.)
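
For a sense of how that failure typically looks, a hypothetical sketch: under 
{{ulimit -v}}, malloc() starts returning NULL mid-run, and code that skips the 
NULL check (common in practice) simply dereferences it and dies with SIGSEGV, 
with nothing connecting the crash to the limit:

{code}
/* Hypothetical: allocate until the address-space limit bites.
 * Run this only under a ulimit, e.g. "ulimit -v 262144" for ~256 MB. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    size_t step = 64UL << 20;                /* 64 MB per iteration */
    for (size_t total = 0; ; total += step) {
        char *buf = malloc(step);
        if (buf == NULL) {                   /* the check many apps omit */
            fprintf(stderr, "malloc failed after %zu MB\n", total >> 20);
            return 1;
        }
        memset(buf, 1, step);                /* touch it so it's really used */
    }
}
{code}

With the NULL check it stops cleanly at the limit; drop the check and the 
memset crashes instead, which is the confusing version most real apps 
exhibit. Java's -Xmx path, by contrast, surfaces as an explicit 
OutOfMemoryError.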

This is why I don't think {{ulimit -v}} is the right approach *in general*. 
That doesn't mean it's not the right approach in specific situations, which 
is why the original proposal for a wrapper script (possibly one mandated by 
the cluster admin) is attractive. In other specific situations, {{ulimit -m}} 
might be more effective than {{ulimit -v}}, or some {{jail}}-like mechanism 
might be employed, and of course Windows users will need something else. 
Adding support to streaming for all the different ways resources might be 
limited does not seem practical.
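
For concreteness, here's what such a wrapper might look like (a sketch under 
POSIX assumptions; the names are mine, not anything in the Hadoop tree). A 
two-line shell script doing {{ulimit -v}} then {{exec}} is equivalent; the 
point is that the policy lives in an admin-controlled wrapper, not in 
streaming itself:

{code}
/* Hypothetical admin wrapper: set RLIMIT_AS, then exec the real task.
 * Equivalent to a shell wrapper that runs: ulimit -v <limit>; exec <cmd> */
#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>
#include <unistd.h>

int main(int argc, char **argv) {
    if (argc < 3) {
        fprintf(stderr, "usage: %s <limit-KB> <command> [args...]\n", argv[0]);
        return 2;
    }
    struct rlimit rl;
    rl.rlim_cur = rl.rlim_max = strtoul(argv[1], NULL, 10) * 1024UL;
    if (setrlimit(RLIMIT_AS, &rl) != 0) {    /* the knob behind ulimit -v */
        perror("setrlimit");
        return 1;
    }
    execvp(argv[2], &argv[2]);               /* become the streaming task */
    perror("execvp");
    return 1;
}
{code}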

(I realize this would all have been much more useful to bring up in the 
original issue, and apologize for not following that one more closely. As one 
path forward, we could reopen 2765 and continue this discussion there.)

> virtual address space limits break streaming apps
> -------------------------------------------------
>
>                 Key: HADOOP-3280
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3280
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/streaming
>            Reporter: Rick Cox
>            Priority: Blocker
>             Fix For: 0.17.0
>
>         Attachments: HADOOP-3280_0_20080418.patch
>
>
> HADOOP-2765 added a mandatory, hard virtual address space limit to streaming 
> apps based on the Java process's -Xmx setting.
>
> This makes it impossible to run a 64-bit streaming app that needs a large 
> address space under a 32-bit JVM, even if one were otherwise willing to 
> dramatically increase the -Xmx setting without cause. Also, unlike Java's 
> -Xmx limit, the virtual address space limit for an arbitrary UNIX process 
> does not necessarily correspond to RAM usage, so it's likely to be a 
> relatively difficult limit to configure.
>
> HADOOP-2765 was originally opened to allow an optional wrapper script around 
> streaming tasks, one use case for which was setting a ulimit. That approach 
> seems much less intrusive and more flexible than the final implementation. 
> The ulimit can also be trivially set by the streaming task itself without 
> any support from Hadoop.
>
> Marking this as an 0.17 blocker because it will break deployed apps and 
> there is no workaround available.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
