[ https://issues.apache.org/jira/browse/HADOOP-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12613622#action_12613622 ]

Vinod Kumar Vavilapalli commented on HADOOP-3581:
-------------------------------------------------

bq. Could we solve this by adding an extra argument specifying the JobId and 
the UserId to enable the script to do per-job/per-user accounting?
I am not sure I understand this correctly. If you meant "pass JobId/UserId 
to the script and do per-job/per-user accounting only", then that won't help - 
we need overall accounting across all tasks on the node.
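
To illustrate what I mean by overall accounting: something like the sketch 
below, which sums resident memory across all task processes via /proc 
(Linux-only; the class name and the assumption that the monitor already 
tracks the task PIDs are mine, just for illustration):

{code}
// Minimal sketch, assuming Linux and that the monitor knows the PIDs of
// the task JVMs it launched. Field 2 of /proc/<pid>/statm is the resident
// set size in pages; summing it across tasks gives node-wide usage
// rather than per-job or per-user usage.
import java.io.BufferedReader;
import java.io.FileReader;
import java.util.List;

public class TaskMemoryMonitor {             // hypothetical class name
  private static final long PAGE_SIZE = 4096; // assumed x86 page size

  public static long totalTaskRssBytes(List<Integer> taskPids) {
    long residentPages = 0;
    for (int pid : taskPids) {
      BufferedReader reader = null;
      try {
        reader = new BufferedReader(new FileReader("/proc/" + pid + "/statm"));
        String[] fields = reader.readLine().trim().split("\\s+");
        residentPages += Long.parseLong(fields[1]); // resident pages
      } catch (Exception e) {
        // the task may have exited between listing and reading; skip it
      } finally {
        try { if (reader != null) reader.close(); } catch (Exception ignored) {}
      }
    }
    return residentPages * PAGE_SIZE;
  }
}
{code}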

bq. The wrapper I proposed before could solve this problem as a side effect 
(with /etc/security/limits.conf). But it might not be portable, and your 
solution may be better for this case.
The limits.conf approach has already been evaluated; it doesn't solve the 
current problem. See this earlier comment on this very JIRA - 
https://issues.apache.org/jira/browse/HADOOP-3581?focusedCommentId=12607650#action_12607650
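
For reference, a limits.conf entry of the kind being discussed would look 
something like this (the 'mapred' user name and the value are hypothetical):

{code}
# /etc/security/limits.conf - cap per-process address space, in KB
# <domain>  <type>  <item>  <value>
mapred      hard    as      2097152
{code}

The 'as' item maps to RLIMIT_AS, which the kernel enforces per process, so N 
tasks can still collectively use N times the cap - one reason a static limit 
like this doesn't give us the overall accounting we need.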

bq. I'm afraid that much of this functionality will not be available for 
threaded tasks anyway. My next proposal will include a fallback mechanism, so 
you shouldn't have to take this into account.
This looks like an interesting problem - how do we manage resource usage by 
each thread? Is there any thread-level resource management support in Java? 
And what is the use-case for threaded tasks in the first place? If the cost of 
a per-task JVM is the only reason to run each task in a thread instead of a 
JVM, we can still achieve resource management of all tasks by forking one 
single JVM and running all tasks as threads of that JVM. That way we can meet 
our objective here too - shielding Hadoop from user code.
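
On the Java question: as far as I can tell, the closest standard hook is 
ThreadMXBean, which exposes per-thread CPU time but nothing for per-thread 
memory, since the heap is shared JVM-wide. A minimal sketch:

{code}
// Sketch of the per-thread accounting the standard library does offer:
// CPU time via java.lang.management.ThreadMXBean. Note there is no
// per-thread heap accounting - the heap is shared across the whole JVM,
// which is exactly why per-thread memory limits are hard.
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadMXBean;

public class ThreadCpuAccounting {           // hypothetical class name
  public static void main(String[] args) {
    ThreadMXBean mx = ManagementFactory.getThreadMXBean();
    if (!mx.isThreadCpuTimeSupported()) {
      System.err.println("per-thread CPU accounting not supported here");
      return;
    }
    for (long id : mx.getAllThreadIds()) {
      long cpuNanos = mx.getThreadCpuTime(id); // -1 if the thread has died
      System.out.println("thread " + id + ": " + cpuNanos + " ns of CPU");
    }
  }
}
{code}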

> Prevent memory intensive user tasks from taking down nodes
> ----------------------------------------------------------
>
>                 Key: HADOOP-3581
>                 URL: https://issues.apache.org/jira/browse/HADOOP-3581
>             Project: Hadoop Core
>          Issue Type: Improvement
>          Components: mapred
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod Kumar Vavilapalli
>         Attachments: patch_3581_0.1.txt
>
>
> Sometimes user Map/Reduce applications can get extremely memory intensive, 
> perhaps due to inadvertent bugs in the user code or the amount of data 
> being processed. When this happens, the user tasks start to interfere with 
> the proper execution of other processes on the node, including other Hadoop 
> daemons like the DataNode and TaskTracker, making the node unusable for any 
> Hadoop tasks. There should be a way to prevent such tasks from bringing 
> down the node.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
