[ https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644096#action_12644096 ]

Owen O'Malley commented on HADOOP-4035:
---------------------------------------

I guess I'm ok with it as a delta from total virtual memory, although how to 
detect the virtual memory in a generic manner is an interesting question. Maybe 
as I proposed over in HADOOP-4523, we need a plugin that could provide 
OS-specific/site functionality.

Note that if we are using virtual memory, then we absolutely need a different 
configuration for the amount of virtual memory that we'd like to schedule to. 
We do not *want* the scheduler to put 4 10G tasks on a machine with 8G ram and 
32G swap. That number should be based on RAM. So, I'd propose that we extend 
the plugin interface as:

{code}
public abstract class MemoryPlugin {
  public abstract long getVirtualMemorySize(Configuration conf);
  public abstract long getRamSize(Configuration conf);
}
{code}
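On Linux, such a plugin could be backed by /proc/meminfo. A minimal sketch of that idea (class and method names are hypothetical, and the Configuration argument from the interface above is dropped to keep it self-contained; virtual memory is approximated as RAM plus swap):

```java
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical Linux-backed memory plugin; reads /proc/meminfo.
public class LinuxMemoryPlugin {

  // Parse a /proc/meminfo line such as "MemTotal:  8167848 kB" into bytes.
  static long parseKbLine(String line) {
    String[] parts = line.trim().split("\\s+");
    return Long.parseLong(parts[1]) * 1024L;
  }

  // Approximate total virtual memory as physical RAM plus swap.
  public long getVirtualMemorySize() throws IOException {
    return readField("MemTotal:") + readField("SwapTotal:");
  }

  public long getRamSize() throws IOException {
    return readField("MemTotal:");
  }

  private long readField(String key) throws IOException {
    try (BufferedReader r =
             new BufferedReader(new FileReader("/proc/meminfo"))) {
      String line;
      while ((line = r.readLine()) != null) {
        if (line.startsWith(key)) {
          return parseKbLine(line);
        }
      }
    }
    throw new IOException(key + " not found in /proc/meminfo");
  }
}
```

A Solaris or Windows variant would parse different sources, which is exactly why the OS-specific bits belong behind a plugin interface.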

I'd propose that these values be the real values and that we have a configured 
offset for both values.

mapred.tasktracker.virtualmemory.reserved (subtracted from virtual memory)
mapred.tasktracker.memory.reserved (subtracted from physical RAM, before 
reporting to the JT)

Jobs should then define a soft and hard limit for their memory usage. If a task 
goes over the hard limit, it should be killed immediately.

The scheduler should only allocate tasks if
  sum(soft limits of tasks) <= TT ram
  sum(hard limits of tasks) <= TT virtual memory
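As code, that admission check might look like the following (class and parameter names hypothetical; all sizes in bytes; the sums are over tasks already running on the TT):

```java
// Hypothetical sketch of the admission check above: a candidate task is
// scheduled only if both memory sums stay within the tasktracker's limits.
public class MemoryAdmission {

  static boolean canSchedule(long sumSoft, long sumHard,
                             long taskSoft, long taskHard,
                             long ttRam, long ttVirtualMemory) {
    // Soft limits are budgeted against RAM, hard limits against
    // total virtual memory (RAM + swap).
    return sumSoft + taskSoft <= ttRam
        && sumHard + taskHard <= ttVirtualMemory;
  }
}
```

With the 8G-RAM/32G-swap machine from the example above, tasks whose soft limits already fill the 8G of RAM would be rejected by the first condition even though plenty of swap remains, which is the behavior we want.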

Thoughts?

> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory 
> requirements and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt, 
> HADOOP-4035-20081006.1.txt, HADOOP-4035-20081006.txt, HADOOP-4035-20081008.txt
>
>
> HADOOP-3759 introduced configuration variables that can be used to specify 
> memory requirements for jobs, and also modified the tasktrackers to report 
> their free memory. The capacity scheduler in HADOOP-3445 should schedule 
> tasks based on these parameters. A task that is scheduled on a TT that uses 
> more than the default amount of memory per slot can be viewed as effectively 
> using more than one slot, as it would decrease the amount of free memory on 
> the TT by more than the default amount while it runs. The scheduler should 
> make the used capacity account for this additional usage while enforcing 
> limits, etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
