[ https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644096#action_12644096 ]
Owen O'Malley commented on HADOOP-4035:
---------------------------------------

I guess I'm OK with it as a delta from total virtual memory, although how to detect the virtual memory in a generic manner is an interesting question. Maybe, as I proposed over in HADOOP-4523, we need a plugin that could provide OS-specific/site functionality.

Note that if we are using virtual memory, then we absolutely need a separate configuration for the amount of virtual memory that we'd like to schedule against. We do *not* want the scheduler to put four 10G tasks on a machine with 8G of RAM and 32G of swap. That number should be based on RAM.

So, I'd propose that we extend the plugin interface as:

{code}
public abstract class MemoryPlugin {
  public abstract long getVirtualMemorySize(Configuration conf);
  public abstract long getRamSize(Configuration conf);
}
{code}

I'd propose that these be the real values, and that we have a configured offset for both:
* mapred.tasktracker.virtualmemory.reserved (subtracted from virtual memory)
* mapred.tasktracker.memory.reserved (subtracted from physical RAM, before reporting to the JT)

Jobs should then define a soft and a hard limit for their memory usage. If a task goes over the hard limit, it should be killed immediately. The scheduler should only allocate tasks if:
* sum(soft limits of tasks) <= TT RAM
* sum(hard limits of tasks) <= TT virtual memory

Thoughts?
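As a rough illustration of what an OS-specific implementation of the proposed plugin might look like, here is a minimal Linux sketch that reads MemTotal and SwapTotal from /proc/meminfo. This is purely hypothetical: the class name, the treatment of RAM + swap as "virtual memory", and the omission of the Configuration parameter (for self-containment) are all assumptions, not part of the proposal above.

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical Linux sketch of the proposed MemoryPlugin; the real plugin
// would extend MemoryPlugin and take a Configuration argument.
public class LinuxMemoryPlugin {

  // Parses a /proc/meminfo line like "MemTotal:  8166804 kB".
  // Returns the value in bytes, or -1 if the line does not match 'key'.
  static long parseMeminfoLine(String line, String key) {
    if (!line.startsWith(key + ":")) {
      return -1;
    }
    String[] parts = line.trim().split("\\s+");
    return Long.parseLong(parts[1]) * 1024L; // /proc/meminfo reports kB
  }

  public long getRamSize() {
    return readMeminfo("MemTotal");
  }

  public long getVirtualMemorySize() {
    // Assumption: RAM plus swap approximates total virtual memory.
    return readMeminfo("MemTotal") + readMeminfo("SwapTotal");
  }

  private long readMeminfo(String key) {
    try (BufferedReader r = new BufferedReader(new FileReader("/proc/meminfo"))) {
      String line;
      while ((line = r.readLine()) != null) {
        long v = parseMeminfoLine(line, key);
        if (v >= 0) {
          return v;
        }
      }
    } catch (IOException e) {
      // fall through and report "unavailable"
    }
    return -1;
  }
}
```

The configured offsets (mapred.tasktracker.*.reserved) would then be subtracted from these raw values before the TT reports them to the JobTracker.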
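The two scheduling constraints above can be sketched as a simple admission check. This is a hand-rolled illustration, not scheduler code: the class and field names are hypothetical, values are in bytes, and the TT's RAM/vmem figures are assumed to already have the reserved offsets subtracted.

```java
import java.util.List;

// Hypothetical sketch of the proposed admission check:
//   sum(soft limits) <= TT RAM  and  sum(hard limits) <= TT virtual memory.
public class MemoryAdmission {

  // A job's per-task memory limits, in bytes (names are illustrative).
  public static class TaskLimits {
    final long soft;
    final long hard;
    public TaskLimits(long soft, long hard) {
      this.soft = soft;
      this.hard = hard;
    }
  }

  // Returns true iff adding 'candidate' keeps both sums within the
  // TaskTracker's (already offset-adjusted) RAM and virtual memory.
  public static boolean canSchedule(List<TaskLimits> running,
                                    TaskLimits candidate,
                                    long ttRam, long ttVmem) {
    long softSum = candidate.soft;
    long hardSum = candidate.hard;
    for (TaskLimits t : running) {
      softSum += t.soft;
      hardSum += t.hard;
    }
    return softSum <= ttRam && hardSum <= ttVmem;
  }
}
```

Note how this rejects the pathological case above: a 10G soft limit fails the RAM-based check on an 8G machine regardless of how much swap is present.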
> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory
> requirements and task trackers' free memory
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt,
> HADOOP-4035-20081006.1.txt, HADOOP-4035-20081006.txt, HADOOP-4035-20081008.txt
>
>
> HADOOP-3759 introduced configuration variables that can be used to specify
> memory requirements for jobs, and also modified the tasktrackers to report
> their free memory. The capacity scheduler in HADOOP-3445 should schedule
> tasks based on these parameters. A task that is scheduled on a TT that uses
> more than the default amount of memory per slot can be viewed as effectively
> using more than one slot, as it would decrease the amount of free memory on
> the TT by more than the default amount while it runs. The scheduler should
> make the used capacity account for this additional usage while enforcing
> limits, etc.

--
This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.