[jira] Updated: (HADOOP-4981) Prior code fix in Capacity Scheduler prevents speculative execution in jobs

Sreekanth Ramakrishnan (JIRA) Fri, 15 May 2009 06:55:12 -0700

     [ 
https://issues.apache.org/jira/browse/HADOOP-4981?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Sreekanth Ramakrishnan updated HADOOP-4981:
-------------------------------------------

    Attachment: HADOOP-4981-4.patch

Attaching a new patch that fixes the test case issues.

The following steps are currently used to test high-RAM jobs with speculative 
execution:
- Configure two task trackers, each with 3 GB vmem, 1 GB physical memory, 
2 map slots and 2 reduce slots.
- Submit a high-memory job whose tasks require 2 GB vmem, with one map task, 
one speculative map task and no reduces.
- Submit a second job with a low memory requirement of 100 MB vmem, with one 
map task and no reduce tasks.
- The first map of the first job is scheduled on tt1.
- Check that the cluster is blocked until the speculative task of the 
high-memory job is scheduled.
- Once the high-memory job's speculative map is scheduled, the map task of the 
normal job should be scheduled.

- Now submit a high-memory job (job3) with a 2 GB vmem requirement, 1 map and 
1 reduce, plus a speculative reduce.
- Now submit a normal job (job4) with a memory requirement of 100 MB.
- The map of the high-memory job (job3) is scheduled first, on tt1.
- When tt1 returns to the scheduler asking for a task, it is blocked: the 
scheduler tries to assign a task from job3, which is a high-memory job and has 
already taken a map slot on that tracker, so there is no room left for it.
- When tt2 returns to the scheduler asking for a task, it is assigned the map 
from job4.
- When tt2 returns to the scheduler again, the reduce of job3 starts running.
- The cluster is now blocked until the speculative reduce of job3 runs.
- Finish the map tasks.
- Make tt1 return to the scheduler; it is assigned the speculative reduce task 
of job3.
- Make tt1 return to the scheduler again; it is assigned the normal reduce task 
of job4, since that task fits in the remaining memory.
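
The blocking behaviour checked in the first scenario can be sketched roughly as 
follows. This is only an illustration, not code from the patch: the class and 
method names are made up, slots and physical memory are ignored, and the 
speculative map is modelled as just one more task the high-memory job still has 
to hand out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; this is not the Capacity Scheduler's code.
final class MemoryFitSketch {

    static final class Tracker {
        final String name;
        long freeVmemMb;

        Tracker(String name, long freeVmemMb) {
            this.name = name;
            this.freeVmemMb = freeVmemMb;
        }
    }

    static final class Job {
        final String name;
        final long taskVmemMb;
        int tasksLeft; // regular plus speculative tasks still to hand out

        Job(String name, long taskVmemMb, int tasksLeft) {
            this.name = name;
            this.taskVmemMb = taskVmemMb;
            this.tasksLeft = tasksLeft;
        }
    }

    /**
     * Returns the name of the job whose task was assigned, or null if the
     * tracker gets nothing on this heartbeat.
     */
    static String assign(Tracker tracker, List<Job> queue) {
        for (Job job : queue) {
            if (job.tasksLeft == 0) {
                continue; // nothing left to schedule for this job
            }
            if (job.taskVmemMb > tracker.freeVmemMb) {
                return null; // block the tracker instead of skipping to a smaller job
            }
            job.tasksLeft--;
            tracker.freeVmemMb -= job.taskVmemMb;
            return job.name;
        }
        return null;
    }

    /** Replays the first scenario: two 3 GB trackers, one 2 GB job, one 100 MB job. */
    static List<String> runFirstScenario() {
        Tracker tt1 = new Tracker("tt1", 3 * 1024);
        Tracker tt2 = new Tracker("tt2", 3 * 1024);
        List<Job> queue = new ArrayList<>();
        queue.add(new Job("highMem", 2 * 1024, 2)); // map + speculative map
        queue.add(new Job("lowMem", 100, 1));       // single map

        List<String> trace = new ArrayList<>();
        trace.add(assign(tt1, queue)); // first map of the high-memory job
        trace.add(assign(tt1, queue)); // only 1 GB left on tt1: blocked
        trace.add(assign(tt2, queue)); // speculative map fits on tt2
        trace.add(assign(tt2, queue)); // now the low-memory map can go
        return trace;
    }

    public static void main(String[] args) {
        System.out.println(runFirstScenario());
    }
}
```

In this sketch tt1 gets nothing on its second heartbeat even though the 100 MB 
task would fit, which is the "cluster is blocked" condition the test checks.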


After a speculative task has been handed to the scheduler, 
{{FakeJobInProgress}} resets its {{hasSpeculativeMap}} and 
{{hasSpeculativeReduce}} fields.
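
As a rough illustration of that reset (a hypothetical stand-in, not the 
{{FakeJobInProgress}} from the patch; only the field name {{hasSpeculativeMap}} 
is taken from it, and the reduce side is assumed to work the same way):

```java
// Hypothetical test double; not the FakeJobInProgress from the patch.
final class SpeculativeFlagSketch {

    static final class FakeJob {
        int pendingMaps;
        boolean hasSpeculativeMap;

        FakeJob(int pendingMaps, boolean hasSpeculativeMap) {
            this.pendingMaps = pendingMaps;
            this.hasSpeculativeMap = hasSpeculativeMap;
        }

        /** Hands out pending maps first, then at most one speculative map. */
        String obtainNewMapTask() {
            if (pendingMaps > 0) {
                pendingMaps--;
                return "map";
            }
            if (hasSpeculativeMap) {
                hasSpeculativeMap = false; // reset so the task is offered only once
                return "speculative-map";
            }
            return null; // nothing left to run
        }
    }

    public static void main(String[] args) {
        FakeJob job = new FakeJob(1, true);
        System.out.println(job.obtainNewMapTask());
        System.out.println(job.obtainNewMapTask());
        System.out.println(job.obtainNewMapTask());
    }
}
```

Without the reset, the fake job in this sketch would keep returning a 
speculative task on every call, so the test could not tell when the job had 
stopped blocking the cluster.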

> Prior code fix in Capacity Scheduler prevents speculative execution in jobs
> ---------------------------------------------------------------------------
>
>                 Key: HADOOP-4981
>                 URL: https://issues.apache.org/jira/browse/HADOOP-4981
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: contrib/capacity-sched
>            Reporter: Vivek Ratan
>         Attachments: 4981.1.patch, 4981.2.patch, HADOOP-4981-1.patch, 
> HADOOP-4981-2.patch, HADOOP-4981-3.patch, HADOOP-4981-4.patch
>
>
> As part of the code fix for HADOOP-4035, the Capacity Scheduler obtains a 
> task from JobInProgress (calling obtainNewMapTask() or obtainNewReduceTask()) 
> only if the number of pending tasks for a job is greater than zero (see the 
> if-block in TaskSchedulingMgr.getTaskFromJob()). So, if a job has no pending 
> tasks and only has running tasks, it will never be given a slot, and will 
> never have a chance to run a speculative task. 
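
To illustrate the description quoted above, here is a minimal sketch of the 
pending-task check and its effect. The names {{Job}}, {{getTaskGated}} and 
{{getTaskUngated}} are invented for this example; the real logic lives in 
TaskSchedulingMgr.getTaskFromJob() and JobInProgress and is more involved.

```java
// Hypothetical model of the check described in the issue; not Hadoop code.
final class PendingGateSketch {

    static final class Job {
        int pendingMaps;
        int runningMaps;

        Job(int pendingMaps, int runningMaps) {
            this.pendingMaps = pendingMaps;
            this.runningMaps = runningMaps;
        }

        /** Returns a pending map if there is one, else a speculative copy of a running map. */
        String obtainNewMapTask() {
            if (pendingMaps > 0) {
                pendingMaps--;
                runningMaps++;
                return "map";
            }
            if (runningMaps > 0) {
                return "speculative-map"; // assumption: every running map may be speculated
            }
            return null;
        }
    }

    /** The check as the issue describes it: ask the job only if tasks are pending. */
    static String getTaskGated(Job job) {
        if (job.pendingMaps > 0) {
            return job.obtainNewMapTask();
        }
        return null; // a job with only running tasks is never asked
    }

    /** Always ask the job and let it decide whether it has anything to run. */
    static String getTaskUngated(Job job) {
        return job.obtainNewMapTask();
    }

    public static void main(String[] args) {
        System.out.println(getTaskGated(new Job(0, 1)));
        System.out.println(getTaskUngated(new Job(0, 1)));
    }
}
```

With no pending maps and one running map, the gated variant returns nothing, 
while the ungated variant lets the job offer a speculative task.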

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
