[
https://issues.apache.org/jira/browse/HADOOP-3581?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12626646#action_12626646
]
Hemanth Yamijala commented on HADOOP-3581:
------------------------------------------
Some comments:
{{ProcessTree}}:
- {{isImplemented}} need not throw an Exception. It can return false if
something fails, since the object can no longer be used in that case.
- rename {{getVmem}} to something like {{getCumulativeVmem}} to better reflect
what it is doing.
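
As a rough illustration of the {{isImplemented}} point above, the check could
swallow the failure and report false. This is only a sketch: the class name
{{ProcfsCheckSketch}}, the static signature and the {{procRoot}} parameter are
hypothetical and are not taken from the patch.

```java
import java.io.File;

// Hypothetical sketch, not the code in the HADOOP-3581 patch.
class ProcfsCheckSketch {

  /**
   * Returns true only if the given procfs root looks usable.
   * Any failure is reported as "not implemented" rather than thrown,
   * because the caller cannot use the object in that case anyway.
   */
  static boolean isImplemented(String procRoot) {
    try {
      File dir = new File(procRoot);
      return dir.isDirectory() && dir.canRead();
    } catch (SecurityException e) {
      // A security manager denied access: treat as unavailable.
      return false;
    }
  }
}
```

A caller would then simply skip memory monitoring when the method returns
false, with no try/catch needed at the call site.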
{{ProcfsBasedProcessTree}}:
- The algorithm in {{initialize}} can be improved. In particular, to construct
the process hierarchy we use a recursive mechanism that visits the same paths
in the process tree multiple times. Instead, we could make one pass to get the
list of all processes, and a second pass to build the parent-child
relationships. Building the required tree then amounts to starting from the
process corresponding to the task and listing its children recursively.
- In {{reconstruct}} we are removing the completed processes by adding them to
a delete list and then walking over it to delete one at a time. Can't we use
{{Iterator.remove}} to achieve what we want?
- In {{reconstruct}} rather than creating a new HashMap to clear the elements,
we can directly call clear on the existing HashMap.
- SLEEP_TIME_BEFORE_SIGKILL should be made a configuration variable.
- Give a name to the SigKillThread.
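
To make the {{initialize}} and {{reconstruct}} suggestions above concrete,
here is a minimal sketch written against a current Java release. It assumes
the first pass has already produced a pid to parent-pid map (for example, by
reading each process entry under /proc once). The names {{ProcessTreeSketch}},
{{buildTree}} and {{removeCompleted}} are hypothetical, and this is not the
code in the patch.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch, not the code in the HADOOP-3581 patch.
class ProcessTreeSketch {

  /**
   * ppidOf is the result of pass 1: pid -> parent pid for every process.
   * Pass 2 inverts it into parent -> children, after which the tree rooted
   * at rootPid is collected by a single recursive walk.
   */
  static List<Integer> buildTree(Map<Integer, Integer> ppidOf, int rootPid) {
    Map<Integer, List<Integer>> childrenOf = new HashMap<>();
    for (Map.Entry<Integer, Integer> e : ppidOf.entrySet()) {
      // Skip a self-parented entry so the walk below cannot loop forever.
      if (!e.getKey().equals(e.getValue())) {
        childrenOf.computeIfAbsent(e.getValue(), k -> new ArrayList<>())
                  .add(e.getKey());
      }
    }
    List<Integer> tree = new ArrayList<>();
    if (ppidOf.containsKey(rootPid)) {
      collect(rootPid, childrenOf, tree);
    }
    return tree;
  }

  // Adds pid and then all of its descendants to out.
  private static void collect(int pid, Map<Integer, List<Integer>> childrenOf,
                              List<Integer> out) {
    out.add(pid);
    List<Integer> children = childrenOf.get(pid);
    if (children == null) {
      children = Collections.emptyList();
    }
    for (int child : children) {
      collect(child, childrenOf, out);
    }
  }

  /**
   * Drops entries for processes that are no longer alive, using
   * Iterator.remove instead of a separate delete list.
   */
  static void removeCompleted(Map<Integer, String> tracked, Set<Integer> alive) {
    Iterator<Map.Entry<Integer, String>> it = tracked.entrySet().iterator();
    while (it.hasNext()) {
      if (!alive.contains(it.next().getKey())) {
        it.remove();
      }
    }
  }
}
```

With this shape each process is read once and each parent-child edge is
visited once, and pruning modifies the existing map in place rather than
building a second collection.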
{{TaskMemoryManagerThread}}:
- MONITORING_INTERVAL should be configurable.
- Instead of using the Object[3] to store the Process related information, we
can use a simple private class to hold this information together.
- The {{processTreeInfo}} map should be between TIP and the object described
above.
- Use {{Configuration(false)}}, which will not load the default configuration,
when setting the jobid.
- Use {{TaskTracker.getMemoryPerTask(TaskInProgress)}} instead of getting the
value from the JobConf.
- When walking over the tasks that are running, you must check the task's
state (running, commit pending, and so on).
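
As an illustration of replacing the {{Object[3]}} with a small private class,
something along these lines would do. The fields shown are guesses at what the
array holds, the String key is only a stand-in for the TIP, and the names
{{TaskMemoryMonitorSketch}}, {{track}} and {{recordAndCheck}} are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not the code in the HADOOP-3581 patch.
class TaskMemoryMonitorSketch {

  /** Holds the per-task process information in named, typed fields. */
  private static final class ProcessTreeInfo {
    private final String pid;           // assumed: pid of the task's root process
    private final long memLimitBytes;   // assumed: memory limit for the task
    private long lastVmemBytes;         // assumed: last observed cumulative vmem

    ProcessTreeInfo(String pid, long memLimitBytes) {
      this.pid = pid;
      this.memLimitBytes = memLimitBytes;
    }
  }

  // Keyed by a task identifier here; the real map would be keyed by the TIP.
  private final Map<String, ProcessTreeInfo> processTreeInfo = new HashMap<>();

  void track(String taskId, String pid, long memLimitBytes) {
    processTreeInfo.put(taskId, new ProcessTreeInfo(pid, memLimitBytes));
  }

  /** Records the latest usage and reports whether the task is over its limit. */
  boolean recordAndCheck(String taskId, long cumulativeVmemBytes) {
    ProcessTreeInfo info = processTreeInfo.get(taskId);
    if (info == null) {
      return false; // task is not being monitored
    }
    info.lastVmemBytes = cumulativeVmemBytes;
    return cumulativeVmemBytes > info.memLimitBytes;
  }
}
```

Named fields remove the casts and index constants that an {{Object[3]}}
requires, and the compiler catches a value stored in the wrong slot.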
> Prevent memory intensive user tasks from taking down nodes
> ----------------------------------------------------------
>
> Key: HADOOP-3581
> URL: https://issues.apache.org/jira/browse/HADOOP-3581
> Project: Hadoop Core
> Issue Type: Improvement
> Components: mapred
> Reporter: Hemanth Yamijala
> Assignee: Vinod Kumar Vavilapalli
> Attachments: HADOOP-3581-final.txt, HADOOP-3581.6.0.txt,
> patch_3581_0.1.txt, patch_3581_3.3.txt, patch_3581_4.3.txt,
> patch_3581_4.4.txt, patch_3581_5.0.txt, patch_3581_5.2.txt
>
>
> Sometimes user Map/Reduce applications can get extremely memory intensive,
> maybe due to some inadvertent bugs in the user code, or the amount of data
> processed. When this happens, the user tasks start to interfere with the
> proper execution of other processes on the node, including other Hadoop
> daemons like the DataNode and TaskTracker. Thus, the node would become
> unusable for any Hadoop tasks. There should be a way to prevent such tasks
> from bringing down the node.
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.