On Feb 13, 2008, at 9:39 AM, Jaideep Dhok wrote:
Hi, AFAIK right now they are only doing FCFS scheduling. You can read the code in org.apache.hadoop.mapred.JobTracker.java. I think the code is in the "getNewTaskForTaskTracker" method.
It actually prefers a host where the data is local. On large jobs, we get 95% of the maps running locally (assuming hdfs cluster == map/reduce cluster). We are also adding rack locality to the scheduling in HADOOP-1985.
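
To make the idea concrete, here is a rough sketch of FCFS scheduling with a preference for data-local maps. It is not the actual JobTracker code: the class and method names (LocalityFcfsSketch, Job, MapTask, pickMapTask) are made up for illustration, and it assumes each map task knows which hosts hold a replica of its input block.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;

// Illustrative only: hypothetical types, not Hadoop's real classes.
public class LocalityFcfsSketch {

    /** A pending map task and the hosts that hold a replica of its input. */
    static class MapTask {
        final String id;
        final Set<String> replicaHosts;

        MapTask(String id, String... hosts) {
            this.id = id;
            this.replicaHosts = new HashSet<>(Arrays.asList(hosts));
        }
    }

    /** A job with its not-yet-assigned map tasks. */
    static class Job {
        final String name;
        final List<MapTask> pending = new ArrayList<>();

        Job(String name, MapTask... tasks) {
            this.name = name;
            pending.addAll(Arrays.asList(tasks));
        }
    }

    /**
     * Picks a map task for the tracker running on trackerHost.
     * Jobs are served in submission order (FCFS); within the first job
     * that still has work, a task whose input is local to the host wins.
     * Returns null when no job has pending maps.
     */
    static MapTask pickMapTask(List<Job> jobsInSubmitOrder, String trackerHost) {
        for (Job job : jobsInSubmitOrder) {
            if (job.pending.isEmpty()) {
                continue; // this job has nothing left, try the next one
            }
            // First choice: a task with a replica on the requesting host.
            for (Iterator<MapTask> it = job.pending.iterator(); it.hasNext();) {
                MapTask task = it.next();
                if (task.replicaHosts.contains(trackerHost)) {
                    it.remove();
                    return task;
                }
            }
            // Otherwise run a non-local task rather than leave the slot idle.
            return job.pending.remove(0);
        }
        return null;
    }
}
```

With rack locality, one would presumably add a middle step between the two choices above: prefer a task with a replica on the same rack before falling back to an arbitrary one.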
-- Owen