[
https://issues.apache.org/jira/browse/HADOOP-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613351#action_12613351
]
Hemanth Yamijala commented on HADOOP-3581:
------------------------------------------
Brice,
bq. I think that there is a more general problem, that is task insulation,
because a bugged process could do many other things than just overloading the
memory.
True. In our environments, we have repeatedly seen processes overloading
memory, so we focused on that problem.
bq. The userBasedInsulator.sh that I proposed in HADOOP-3675 could solve this
issue (and a few others) in an easier way.
So, the goal of this JIRA is to prevent user tasks from adversely affecting
one another, or other system daemons on the node, by gobbling up memory. We
could not find an out-of-the-box OS solution for limiting the memory of a
process and its descendants. Specifically, ulimit did not seem to apply to
processes spawned from a parent whose memory limit was set. Maybe virtual
machines will help, but IMO we are still some way off from deciding which tool
is suitable for this. That is why this JIRA proposes implementing the tracking
of memory on its own.
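For illustration, the kind of tracking described above can be sketched by
walking /proc on Linux: build a parent-to-children map from /proc/<pid>/stat,
then sum the resident memory (VmRSS from /proc/<pid>/status) over a task's
whole process tree. This is only a minimal sketch of the idea, not the code in
the attached patch; the function names are illustrative.

```python
import os

def build_child_map(proc_root="/proc"):
    """Map each parent pid to its child pids by scanning /proc/<pid>/stat."""
    children = {}
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                stat = f.read()
            # The ppid is the second field after the parenthesised command name.
            ppid = int(stat.rsplit(")", 1)[1].split()[1])
        except (OSError, ValueError, IndexError):
            continue  # the process exited while we were scanning
        children.setdefault(ppid, []).append(int(entry))
    return children

def rss_kb(pid, proc_root="/proc"):
    """Resident set size of one process in kB, from /proc/<pid>/status."""
    try:
        with open(os.path.join(proc_root, str(pid), "status")) as f:
            for line in f:
                if line.startswith("VmRSS:"):
                    return int(line.split()[1])
    except OSError:
        pass
    return 0  # process gone, or a kernel thread with no VmRSS line

def tree_rss_kb(root_pid, children=None):
    """Total RSS in kB of root_pid and all of its descendants."""
    if children is None:
        children = build_child_map()
    total, stack = 0, [root_pid]
    while stack:
        pid = stack.pop()
        total += rss_kb(pid)
        stack.extend(children.get(pid, []))
    return total
```

A tracker along these lines could sample tree_rss_kb for each task's root pid
periodically and kill the tree when it crosses a configured limit, which is
exactly the gap ulimit leaves for spawned children.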
If you are aware of a way to achieve this with an OS-specific mechanism, we
would gladly look at it. It would also be significantly easier to use the
mechanism you propose (via the wrapper script); then we could focus on
HADOOP-3675. Please do let us know of any solution you have in mind.
> Prevent memory intensive user tasks from taking down nodes
> ----------------------------------------------------------
>
> Key: HADOOP-3581
> URL: https://issues.apache.org/jira/browse/HADOOP-3581
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Hemanth Yamijala
> Assignee: Vinod Kumar Vavilapalli
> Attachments: patch_3581_0.1.txt
>
>
> Sometimes user Map/Reduce applications can get extremely memory intensive,
> maybe due to some inadvertent bugs in the user code, or the amount of data
> processed. When this happens, the user tasks start to interfere with the
> proper execution of other processes on the node, including other Hadoop
> daemons like the DataNode and TaskTracker. Thus, the node would become
> unusable for any Hadoop tasks. There should be a way to prevent such tasks
> from bringing down the node.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.