[ http://issues.apache.org/jira/browse/HADOOP-299?page=comments#action_12416131 ]

Andrzej Bialecki commented on HADOOP-299:
-----------------------------------------

It's the same issue that I reported earlier on the mailing list (e.g. 
http://www.mail-archive.com/hadoop-dev@lucene.apache.org/msg01510.html, 
http://www.mail-archive.com/hadoop-dev@lucene.apache.org/msg01524.html and 
http://www.mail-archive.com/hadoop-dev@lucene.apache.org/msg01557.html).

Your patch, although it improves things, could perhaps go one step further if 
you have some time to spare... ;) I'm thinking specifically about the following:

* don't schedule reduce tasks from jobs whose map tasks have had no chance 
to run yet. This happens when a couple of slots are available but map tasks 
cannot be scheduled yet; the code at JobTracker:743 still allocates reduce 
tasks from the next job. (See the first sketch after this list.)

* allow simple priority-based preemption, i.e. jobs with a higher priority 
(presumably short-lived) could be favored in task allocation over 
already-running jobs. (See the second sketch below.)

* alternatively, allow setting limits on the min/max % of cluster capacity a 
job is willing to accept. This gives the scheduler some leeway to divide the 
flexible portion of the remaining slots between old and new jobs. (See the 
configuration sketch below.)

* allow setting different priorities for maps and reduces - e.g. in the Nutch 
fetcher job, map tasks are very long-running and in many cases need to fit 
within a specified time-frame (e.g. during the night), whereas the reduce 
tasks, which simply reshuffle the data, are not so time-critical. (Also 
covered in the configuration sketch below.)
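
To make the first point concrete, here is a minimal, self-contained sketch of 
the guard I have in mind. This is a toy model of the task-assignment loop, 
not the actual JobInProgress API - all names here are made up:

    import java.util.List;

    // Toy model (not Hadoop API): a job exposes how many of its maps have
    // finished, and we refuse to give it reduce slots until that count is
    // non-zero, so reduces aren't parked on jobs whose maps never ran.
    public class ReduceGate {
        static class Job {
            final String name;
            final int finishedMaps;     // maps completed so far
            final int pendingReduces;   // reduces not yet scheduled
            Job(String name, int finishedMaps, int pendingReduces) {
                this.name = name;
                this.finishedMaps = finishedMaps;
                this.pendingReduces = pendingReduces;
            }
        }

        /** Pick the next job to receive a reduce slot, skipping jobs whose
         *  maps have had no chance to run yet; null if none qualifies. */
        static Job nextReduceJob(List<Job> jobsByPriority) {
            for (Job job : jobsByPriority) {
                if (job.finishedMaps == 0) {
                    continue;   // no map output yet - don't hand out reduces
                }
                if (job.pendingReduces > 0) {
                    return job;
                }
            }
            return null;
        }

        public static void main(String[] args) {
            List<Job> jobs = List.of(
                new Job("job1", 0, 4),    // maps never ran: skipped
                new Job("job2", 3, 2));   // has map output: takes the slot
            System.out.println(nextReduceJob(jobs).name);  // prints "job2"
        }
    }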
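For the priority point, the ordering could be as simple as sorting the 
runnable jobs before offering slots - again only a sketch with made-up names, 
not the real JobInProgress:

    import java.util.ArrayList;
    import java.util.Comparator;
    import java.util.List;

    // Toy model: higher-priority jobs are offered free slots before older,
    // lower-priority ones; ties fall back to submission time (plain FIFO).
    public class PriorityOrder {
        record RunnableJob(String name, int priority, long submitTime) {}

        public static void main(String[] args) {
            List<RunnableJob> jobs = new ArrayList<>(List.of(
                new RunnableJob("longCrawl", 1, 100L),
                new RunnableJob("quickFix",  5, 200L)));

            jobs.sort(Comparator
                .comparingInt(RunnableJob::priority).reversed() // higher first
                .thenComparingLong(RunnableJob::submitTime));   // then FIFO

            System.out.println(jobs.get(0).name());  // prints "quickFix"
        }
    }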
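The last two points could even be plain job configuration knobs. To be clear, 
none of these property names exist today - I invented them just to show the 
shape of it:

    import org.apache.hadoop.mapred.JobConf;

    // Hypothetical configuration properties (invented names) for per-job
    // capacity limits and separate map/reduce priorities.
    public class SchedulerKnobs {
        public static void main(String[] args) {
            JobConf conf = new JobConf();
            conf.setInt("mapred.job.capacity.min.percent", 10); // never starve below 10%
            conf.setInt("mapred.job.capacity.max.percent", 60); // never hog above 60%
            conf.set("mapred.job.map.priority", "HIGH");   // fetcher maps are time-boxed
            conf.set("mapred.job.reduce.priority", "LOW"); // reshuffling can wait
        }
    }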


> maps from second jobs will not run until the first job finishes completely
> --------------------------------------------------------------------------
>
>          Key: HADOOP-299
>          URL: http://issues.apache.org/jira/browse/HADOOP-299
>      Project: Hadoop
>         Type: Bug
>   Components: mapred
>     Versions: 0.3.2
>     Reporter: Owen O'Malley
>     Assignee: Owen O'Malley
>      Fix For: 0.4.0
>  Attachments: map-schedule.patch
>
> Because of the logic in the JobTracker's pollForNewTask, second jobs will 
> rarely start running maps until the first job finishes completely. The 
> JobTracker leaves room to re-run failed maps from the first job, but it 
> reserves room equal to the first job's total number of maps. Thus, if the 
> first job has more maps than your cluster's capacity, none of the second 
> job's maps will ever run.
> I propose setting the reserve to 1% of the first job's maps.
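
For what it's worth, the proposed reserve works out to something like the 
following - my own sketch of the arithmetic, not the actual patch:

    // Sketch of the 1% reserve: instead of holding back slots for the first
    // job's entire map count, hold back only 1% of it (and at least one
    // slot) for re-running failed maps.
    public class MapReserve {
        static int reserve(int totalMapsInFirstJob) {
            return Math.max(1, totalMapsInFirstJob / 100);
        }

        public static void main(String[] args) {
            // A 5000-map first job: the old logic effectively reserved all
            // 5000 slots, so a smaller cluster never ran the second job's
            // maps; the proposed reserve is just 50 slots.
            System.out.println(reserve(5000));  // prints 50
        }
    }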
