[ https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12644660#action_12644660 ]

Hemanth Yamijala commented on HADOOP-4035:
------------------------------------------

bq. Should the memory-related config values be expressed in MB or GB or KB or 
just bytes? MB sounds good to me.
The other memory-related parameter we have in Hadoop is mapred.child.ulimit, 
which is specified in KB. I think expressing these values in KB would keep 
things consistent.

bq. If a job's specified VM or RAM task limit is higher than the max limit, 
that job shouldn't be allowed to run. Should the JT reject the job when it is 
submitted, or should the scheduler do it, by failing the job?
I think having the scheduler fail the job is more consistent, since the 
scheduling decisions are being made in the scheduler.
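
As a sketch of the check involved (method name and value conventions are 
assumptions, not the actual patch; -1 is taken here to mean "no limit 
configured"):

{code:java}
/** Returns true if the scheduler should fail the job outright. */
static boolean exceedsClusterLimit(long jobMaxVmemKB, long clusterLimitKB) {
  // A job asking for more vmem per task than the cluster-wide limit
  // can never be scheduled, so the scheduler fails it rather than
  // leaving it queued forever.
  return clusterLimitKB != -1 && jobMaxVmemKB > clusterLimitKB;
}
{code}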

bq. Should the Capacity Scheduler use the entire RAM of a TT when making a 
scheduling decision, or an offset?
I am not really sure either way. Given our earlier discussions concluding that 
virtual memory is what really matters, I am guessing we don't need the offset.

Regarding the config variable names, a few concerns/suggestions (the proposed 
names are collected in a sketch below):
- mapred.tasktracker.virtualmemory.reserved: This reads as if it specifies the 
amount of memory reserved for Hadoop, whereas it means the opposite. Can we 
call it mapred.tasktracker.vmem.excluded?
- We are using both 'virtualmemory' and 'vm' to represent virtual memory. 
Should we consistently name it 'vmem' everywhere?
- Similarly, should we rename the other variables to mapred.task.maxvmem.default 
and mapred.task.maxvmem.limit?
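
For reference, the proposed names collected in one place (illustrative Java 
constants only, not committed code):

{code:java}
// Illustrative only: the renames proposed in this comment.
public interface ProposedVmemKeys {
  String TT_VMEM_EXCLUDED     = "mapred.tasktracker.vmem.excluded";
  String TASK_MAXVMEM_DEFAULT = "mapred.task.maxvmem.default";
  String TASK_MAXVMEM_LIMIT   = "mapred.task.maxvmem.limit";
}
{code}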

Does this make sense?

> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory 
> requirements and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.20.0
>
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt, 
> HADOOP-4035-20081006.1.txt, HADOOP-4035-20081006.txt, HADOOP-4035-20081008.txt
>
>
> HADOOP-3759 introduced configuration variables that can be used to specify 
> memory requirements for jobs, and also modified the tasktrackers to report 
> their free memory. The capacity scheduler in HADOOP-3445 should schedule 
> tasks based on these parameters. A task that is scheduled on a TT that uses 
> more than the default amount of memory per slot can be viewed as effectively 
> using more than one slot, as it would decrease the amount of free memory on 
> the TT by more than the default amount while it runs. The scheduler should 
> make the used capacity account for this additional usage while enforcing 
> limits, etc.
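
As a rough illustration of the slot accounting described in the issue above 
(the names here are illustrative, not from the patch):

{code:java}
/**
 * A task requesting more vmem than the per-slot default effectively
 * occupies multiple slots; e.g. a 3 GB task on a tracker whose default
 * is 1 GB per slot counts as 3 slots of used capacity.
 */
static int effectiveSlots(long taskVmemKB, long defaultVmemPerSlotKB) {
  return (int) Math.ceil((double) taskVmemKB / defaultVmemPerSlotKB);
}
{code}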

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
