[
https://issues.apache.org/jira/browse/HADOOP-3280?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12590575#action_12590575
]
Rick Cox commented on HADOOP-3280:
----------------------------------
I'd guess there are many users who do not want Hadoop to limit tasks (be they
Java or streaming). When a cluster exists to run specific tasks, it seems
reasonable to let those tasks use all of its resources.
On this issue, a default {{ulimit -v}} will cause some pretty strange failures
while also failing to prevent resource exhaustion in other cases. For example,
some tasks may mmap multi-GB files but touch only a few pages. Others may link
libraries that require 100s of MB of address space for code that's never
executed (and thus never read). Still others may fork off lots of sub-processes
and thus ultimately consume more RAM than any single process's virtual address
space. (Btw, these examples are all taken from our deployed Hadoop apps.)
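To make the mmap case concrete, here is a minimal Python sketch (my own
illustration, not code from any of the deployed apps mentioned above) of a
task that reserves far more address space than it will ever use as RAM:

```python
import mmap
import resource

# Hypothetical illustration: reserve a 4 GiB anonymous mapping but touch
# only a single page. Virtual address space usage is ~4 GiB while resident
# memory stays tiny, so a ulimit -v sized for "reasonable" RAM use would
# kill this otherwise harmless task.
SIZE = 4 * 1024 * 1024 * 1024                    # 4 GiB of address space
m = mmap.mmap(-1, SIZE, flags=mmap.MAP_PRIVATE)  # anonymous private mapping
m[0] = 1                                         # touch one page (~4 KiB resident)

# ru_maxrss (KiB on Linux) shows actual RAM use is nowhere near 4 GiB
print(len(m), resource.getrusage(resource.RUSAGE_SELF).ru_maxrss)
```

A task like this runs fine with no limit set, but fails immediately under a
{{ulimit -v}} derived from a few hundred MB of -Xmx.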
Further, when these tasks hit the virtual address space limit, they are likely
to fail in confusing, difficult-to-debug ways, since few apps are written to
handle that case gracefully. Worse, the same commands work fine when run
outside of Hadoop, so the failure is inexplicable unless the user reads the
streaming code and notices that it is imposing this limit. (This is in
contrast to the -Xmx limit, which can actually influence the garbage collector
to be more aggressive, is a commonly used Java option, and produces a
relatively clear OutOfMemoryError on failure.)
This is why I don't think {{ulimit -v}} is the right approach *in general*.
That doesn't mean it's not the right approach for specific situations, and
hence the original proposal for a wrapper script (possibly one mandated by the
cluster admin) is attractive. In other specific situations, {{ulimit -m}} might
be more effective than {{ulimit -v}}, or some {{jail}}-like mechanism might be
employed, and of course Windows users will need something else. Adding support
to streaming for all the different ways resources might be limited does not
seem practical.
(I realize this would all have been much more useful to bring up in the
original issue, and apologize for not following that one more closely. As one
path forward, we could reopen 2765 and continue this discussion there.)
> virtual address space limits break streaming apps
> -------------------------------------------------
>
> Key: HADOOP-3280
> URL: https://issues.apache.org/jira/browse/HADOOP-3280
> Project: Hadoop Core
> Issue Type: Bug
> Components: contrib/streaming
> Reporter: Rick Cox
> Priority: Blocker
> Fix For: 0.17.0
>
> Attachments: HADOOP-3280_0_20080418.patch
>
>
> HADOOP-2765 added a mandatory, hard virtual address space limit to streaming
> apps based on the Java process's -Xmx setting.
> This makes it impossible to run a 64-bit streaming app that needs a large
> address space under a 32-bit JVM, even if one is otherwise willing to
> increase the -Xmx setting dramatically and without cause. Also, unlike
> Java's -Xmx limit, the virtual address space limit for an arbitrary UNIX
> process does not necessarily correspond to RAM usage, so it's likely to be
> a relatively difficult limit to configure.
> 2765 was originally opened to allow an optional wrapper script around
> streaming tasks, one use case for which was setting a ulimit. That approach
> seems much less intrusive and more flexible than the final implementation.
> The ulimit can also be trivially set by the streaming task itself without any
> support from Hadoop.
> Marking this as an 0.17 blocker because it will break deployed apps and there
> is no workaround available.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.