[ 
https://issues.apache.org/jira/browse/HADOOP-4035?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12638574#action_12638574
 ] 

Hemanth Yamijala commented on HADOOP-4035:
------------------------------------------

Some comments:

JobConf:
- I think it is OK to expose whether memory based scheduling is enabled as an 
API.

CapacityTaskScheduler:
- {{jobFitsOnTT}}: if a job has not requested any memory, we promise it at 
least defaultMemoryPerSlot on the TT. So, I think this method should still 
check for that part.
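
For instance, the fallback could look something like this (a rough sketch 
only - the names {{defaultMemoryPerSlot}} and the sentinel value are 
illustrative, not the actual patch code):

```java
// Illustrative sketch only: field and method names are hypothetical,
// not the actual CapacityTaskScheduler code.
class MemoryCheckSketch {
    // Sentinel meaning "job did not specify a memory requirement".
    static final long DISABLED_LIMIT = -1L;

    // If the job did not ask for memory, fall back to the per-slot
    // default before comparing against the TT's free memory.
    static boolean jobFitsOnTT(long jobRequestedMemory,
                               long defaultMemoryPerSlot,
                               long freeMemoryOnTT) {
        long needed = (jobRequestedMemory == DISABLED_LIMIT)
                ? defaultMemoryPerSlot
                : jobRequestedMemory;
        return needed <= freeMemoryOnTT;
    }
}
```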
- Since we already have a map/reduce based {{TaskSchedulingMgr}}, can we 
implement {{jobFitsOnTT}} without checks based on whether it's a map or a 
reduce task? One way to do that would be to define an abstract 
{{getFreeVirtualMemoryForTask()}} in {{TaskSchedulingMgr}} and implement it in 
{{MapSchedulingMgr}} to return 
{{resourceStatus.getFreeVirtualMemoryForMaps()}}, and so on.
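
Roughly along these lines (a sketch of the suggested refactoring; the 
surrounding Hadoop classes are stubbed out here for illustration):

```java
// Stub standing in for the real TaskTrackerStatus.ResourceStatus.
class ResourceStatusStub {
    private final long freeForMaps, freeForReduces;
    ResourceStatusStub(long m, long r) {
        freeForMaps = m;
        freeForReduces = r;
    }
    long getFreeVirtualMemoryForMaps() { return freeForMaps; }
    long getFreeVirtualMemoryForReduces() { return freeForReduces; }
}

abstract class TaskSchedulingMgrSketch {
    // Each subclass reports the free memory relevant to its own task
    // type, so jobFitsOnTT needs no map-vs-reduce branching.
    abstract long getFreeVirtualMemoryForTask(ResourceStatusStub status);
}

class MapSchedulingMgrSketch extends TaskSchedulingMgrSketch {
    long getFreeVirtualMemoryForTask(ResourceStatusStub s) {
        return s.getFreeVirtualMemoryForMaps();
    }
}

class ReduceSchedulingMgrSketch extends TaskSchedulingMgrSketch {
    long getFreeVirtualMemoryForTask(ResourceStatusStub s) {
        return s.getFreeVirtualMemoryForReduces();
    }
}
```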
- {{InAdequateResourcesException}} should be {{InadequateResourcesException}}. 
Does it need to extend IOException?
- {{updateResourcesInformation}}: if any one TT reports 
DISABLED_VIRTUAL_MEMORY_LIMIT, we don't need to proceed with the loop - a 
small optimization?
- Also, this need not be done at all if memory management is disabled.
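
Both points together might look like this (again a sketch with illustrative 
names, not the patch code):

```java
import java.util.List;

// Sketch of the early exit: skip the pass entirely when memory
// management is off, and stop scanning TT reports as soon as any
// tracker reports a disabled memory limit. Names are illustrative.
class UpdateResourcesSketch {
    static final long DISABLED_VIRTUAL_MEMORY_LIMIT = -1L;

    // Returns true iff memory-based scheduling can proceed cluster-wide.
    static boolean scanTrackers(boolean memoryManagementEnabled,
                                List<Long> reportedLimits) {
        if (!memoryManagementEnabled) {
            return false; // nothing to compute when the feature is off
        }
        for (long limit : reportedLimits) {
            if (limit == DISABLED_VIRTUAL_MEMORY_LIMIT) {
                return false; // no need to look at remaining trackers
            }
        }
        return true;
    }
}
```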
- jip.isKillInProgress() -- I think this is going to be changed. Will this 
trigger {{jobCompleted}} events? This should be checked against the solution 
for HADOOP-4053.

- Can we somehow avoid duplicating the following code between 
{{CapacityTaskScheduler}} and {{JobQueueTaskScheduler}}:
-- jobFitsOnTT
-- updateResourcesInformation()
-- killing of jobs
This is significant logic, and avoiding the duplication would help.

I still need to review the changes to the testcases.

> Modify the capacity scheduler (HADOOP-3445) to schedule tasks based on memory 
> requirements and task trackers free memory
> ------------------------------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-4035
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4035
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>    Affects Versions: 0.19.0
>            Reporter: Hemanth Yamijala
>            Assignee: Vinod K V
>            Priority: Blocker
>             Fix For: 0.19.0
>
>         Attachments: 4035.1.patch, HADOOP-4035-20080918.1.txt, 
> HADOOP-4035-20081006.1.txt, HADOOP-4035-20081006.txt, HADOOP-4035-20081008.txt
>
>
> HADOOP-3759 introduced configuration variables that can be used to specify 
> memory requirements for jobs, and also modified the tasktrackers to report 
> their free memory. The capacity scheduler in HADOOP-3445 should schedule 
> tasks based on these parameters. A task that is scheduled on a TT that uses 
> more than the default amount of memory per slot can be viewed as effectively 
> using more than one slot, as it would decrease the amount of free memory on 
> the TT by more than the default amount while it runs. The scheduler should 
> make the used capacity account for this additional usage while enforcing 
> limits, etc.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
