[ https://issues.apache.org/jira/browse/MAPREDUCE-1221?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838403#action_12838403 ]
Arun C Murthy commented on MAPREDUCE-1221:
------------------------------------------

Dhruba, this is not about making users write good code. This is about penalizing poorly behaved applications. If an MR job consumes too much memory, you want to fail its component tasks and eventually fail the job. My concern is that the _current implementation_ of this feature is not doing that.

To be clear, I'm not against tracking the physical memory used by the process. I'm only proposing that we penalize applications that consume too much memory. To that effect, I'm proposing we *fail* the task if it exceeds the limit.

I prefer the per-task limit since it has served us well with virtual memory. Maybe it is a bad idea to use the same model for physical memory; maybe someone can help me understand why that is so. I'm happy to reconsider it then. I've asked Allen Wittnauer for his thoughts on this too.

> Kill tasks on a node if the free physical memory on that machine falls below a configured threshold
> ----------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-1221
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1221
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: tasktracker
>    Affects Versions: 0.22.0
>            Reporter: dhruba borthakur
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-1221-v1.patch, MAPREDUCE-1221-v2.patch, MAPREDUCE-1221-v3.patch
>
>
> The TaskTracker currently supports killing tasks if the virtual memory of a task exceeds a set of configured thresholds. I would like to extend this feature to enable killing tasks if the physical memory used by that task exceeds a certain threshold.
> On a certain operating system (guess?), if user-space processes start using lots of memory, the machine hangs and dies quickly. This means that we would like to prevent map-reduce jobs from triggering this condition. From my understanding, the killing based on virtual-memory limits (HADOOP-5883) was designed to address this problem. This works well when most map-reduce jobs are Java jobs and have well-defined -Xmx parameters that specify the maximum virtual memory for each task. On the other hand, if each task forks off mappers/reducers written in other languages (python/php, etc.), the total virtual memory usage of the process subtree varies greatly. In these cases, it is better to use kill-tasks-using-physical-memory-limits.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
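
For illustration, here is a minimal sketch of the per-task physical-memory check discussed above, assuming the limit is compared against the summed resident set size (RSS) of a task's process tree as read from /proc on Linux. The class and method names are hypothetical; this is not the Hadoop implementation, only a sketch of the per-task limit model.

{code:java}
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.List;

/**
 * Hypothetical sketch of a per-task physical-memory check: read the
 * resident set size (RSS) of each process in a task's process tree
 * from /proc and report whether the total exceeds a configured
 * per-task limit, so the caller can fail the task.
 */
public class PhysicalMemoryCheck {

  /** Read VmRSS (in bytes) for a single pid from /proc/<pid>/status. */
  static long rssBytes(int pid) throws IOException {
    List<String> lines = Files.readAllLines(Paths.get("/proc/" + pid + "/status"));
    for (String line : lines) {
      if (line.startsWith("VmRSS:")) {
        // The line looks like: "VmRSS:    123456 kB"
        String[] parts = line.trim().split("\\s+");
        return Long.parseLong(parts[1]) * 1024L;
      }
    }
    return 0L; // kernel threads or exited processes report no VmRSS
  }

  /** Sum RSS over one task's process tree and compare against the limit. */
  static boolean overLimit(int[] taskTreePids, long limitBytes) throws IOException {
    long total = 0L;
    for (int pid : taskTreePids) {
      total += rssBytes(pid);
    }
    return total > limitBytes;
  }

  public static void main(String[] args) throws IOException {
    // Example: a task whose process tree is just this JVM, checked
    // against a 2 GB per-task physical-memory limit.
    int selfPid = (int) ProcessHandle.current().pid();
    long limit = 2L * 1024 * 1024 * 1024;
    System.out.println("over limit: " + overLimit(new int[] { selfPid }, limit));
  }
}
{code}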