[
https://issues.apache.org/jira/browse/HADOOP-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613622#action_12613622
]
Vinod Kumar Vavilapalli commented on HADOOP-3581:
-------------------------------------------------
bq. Could we solve this by adding an extra argument specifying the JobId and
the UserId to enable the script to do by job/user accounting ?
I am not sure I understand this correctly. If you mean "pass the JobId/UserId
to the script and do only per-job/per-user accounting", that won't help -
we need overall accounting across all tasks on the node.
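For illustration, this is roughly what I mean by overall accounting - the
TaskTracker would sum the memory of every task JVM it has launched and check
the total against one node-wide limit. A minimal sketch, assuming a Linux-style
/proc filesystem and that the TaskTracker knows the task PIDs; the class and
method names are made up for this example:
{code}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.Collection;

public class NodeMemoryAccounting {

  // Reads VmRSS (resident set size, in kB) from /proc/<pid>/status (Linux only).
  static long residentKb(int pid) throws IOException {
    BufferedReader in = new BufferedReader(new FileReader("/proc/" + pid + "/status"));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        if (line.startsWith("VmRSS:")) {
          return Long.parseLong(line.replaceAll("[^0-9]", ""));
        }
      }
    } finally {
      in.close();
    }
    return 0;
  }

  // True if the combined resident memory of all task JVMs exceeds the
  // node-wide limit, regardless of which job or user each task belongs to.
  static boolean overLimit(Collection<Integer> taskPids, long limitKb) {
    long total = 0;
    for (int pid : taskPids) {
      try {
        total += residentKb(pid);
      } catch (IOException e) {
        // The task may have exited between listing and reading; ignore it.
      }
    }
    return total > limitKb;
  }
}
{code}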
bq. The wrapper I proposed before could solve this problem as a side effect
(with /etc/security/limits.conf). But it might not be portable, and your
solution may be intended for this case.
The limits.conf approach has already been evaluated; it doesn't solve the
current problem. See this earlier comment on this very JIRA -
https://issues.apache.org/jira/browse/HADOOP-3581?focusedCommentId=12607650#action_12607650
bq. I'm afraid that much of this functionality will not be available for
threaded tasks anyway. My next proposal will include a fallback mechanism,
so you shouldn't have to take this into account.
This looks like an interesting problem - how do we manage resource usage by
each thread? Is there any thread-level resource management support in Java?
And what is the use-case for threaded tasks in the first place? If the cost of
a per-task JVM is the only reason to run each task in a thread instead of its
own JVM, we can still get resource management of all tasks by forking one
single JVM and running all the tasks as threads of that JVM. That way we also
meet our objective here - shielding Hadoop from user code.
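To make the single-JVM alternative concrete, a rough sketch (a hypothetical
class, not part of the attached patch): the TaskTracker forks exactly one
child JVM, the child runs every task as a thread, and the memory check above
then only has to watch that one process, while an exception in user code kills
at most the child and never the TaskTracker:
{code}
// Hypothetical child JVM that runs all tasks as threads of one process.
// The TaskTracker forks this single JVM and monitors its memory; user code
// that misbehaves can take down this process, but not the TaskTracker itself.
public class SingleJvmTaskRunner {

  // 'Runnable task' stands in for the real Task API in this sketch.
  static Thread launch(Runnable task, String name) {
    Thread t = new Thread(task, name);
    // A failure in user code terminates only this thread, and we can report it.
    t.setUncaughtExceptionHandler(new Thread.UncaughtExceptionHandler() {
      public void uncaughtException(Thread th, Throwable e) {
        System.err.println("Task " + th.getName() + " failed: " + e);
      }
    });
    t.start();
    return t;
  }
}
{code}
On the per-thread question: as far as I know, java.lang.management.ThreadMXBean
can report per-thread CPU time, but the standard API has no per-thread heap
accounting, which is why process-level limits look more practical here.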
> Prevent memory intensive user tasks from taking down nodes
> ----------------------------------------------------------
>
> Key: HADOOP-3581
> URL: https://issues.apache.org/jira/browse/HADOOP-3581
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Hemanth Yamijala
> Assignee: Vinod Kumar Vavilapalli
> Attachments: patch_3581_0.1.txt
>
>
> Sometimes user Map/Reduce applications can get extremely memory intensive,
> maybe due to some inadvertent bugs in the user code, or the amount of data
> processed. When this happens, the user tasks start to interfere with the
> proper execution of other processes on the node, including other Hadoop
> daemons like the DataNode and TaskTracker. Thus, the node would become
> unusable for any Hadoop tasks. There should be a way to prevent such tasks
> from bringing down the node.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.