[ http://issues.apache.org/jira/browse/HADOOP-27?page=all ]
Johan Oskarson updated HADOOP-27:
---------------------------------

    Attachment: minspace_mapredtask.patch

To begin with, disregard my last comment, I have no idea what I was thinking :)

I created this patch to stop a job from losing its already completed map tasks when a task tracker runs out of disk space. Instead, the task tracker stops accepting new tasks when free space drops below a certain threshold. If space clears up again for some reason, it starts accepting tasks again.

If, for example, a reduce operation has previously been assigned to the task tracker, there is a chance it will run out of space anyway. So if free space drops below a second threshold, the tracker stops accepting new tasks completely until the job is done, and it also kills the running reduce operation or, if none is found, a map task. It tries to pick the one with the least progress.

The solution might not be ideal, but it is at least better than having the job fail all the time because the task trackers drop off one by one. Suggestions are of course welcome.

I've tested this on our tiny cluster and it seems to work fine; it just saved me a couple of hours of redundant computation on a big job.

/Johan

> MapRed tries to allocate tasks to nodes that have no available disk space
> -------------------------------------------------------------------------
>
>          Key: HADOOP-27
>          URL: http://issues.apache.org/jira/browse/HADOOP-27
>      Project: Hadoop
>         Type: Bug
>   Components: mapred
>     Reporter: Mike Cafarella
>  Attachments: minspace_mapredtask.patch
>
> What it says above. MapRed TaskTrackers should not offer task service if the local disk
> space is too constrained.
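
For readers who want the two-threshold behaviour from the comment in concrete form, here is a rough sketch. It is not the code from minspace_mapredtask.patch: the class SpaceGuard, its methods, and the byte-based limits are hypothetical names made up for illustration. It also assumes that the second threshold is no higher than the first, and that "least progress" is applied to reduce tasks first and to map tasks only when no reduce is running.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

/** Hypothetical sketch of the two-threshold policy; not taken from the patch. */
class SpaceGuard {
    enum TaskType { MAP, REDUCE }

    /** Minimal stand-in for a running task: an id, a type, and progress in [0, 1]. */
    static final class RunningTask {
        final String id;
        final TaskType type;
        final double progress;

        RunningTask(String id, TaskType type, double progress) {
            this.id = id;
            this.type = type;
            this.progress = progress;
        }
    }

    private final long softLimitBytes; // below this: do not accept new tasks
    private final long hardLimitBytes; // below this: kill a task and lock out
    private boolean lockedOut = false; // stays true until the job is done

    SpaceGuard(long softLimitBytes, long hardLimitBytes) {
        if (hardLimitBytes > softLimitBytes) {
            // Assumption: the second threshold is the stricter one.
            throw new IllegalArgumentException("hard limit must not exceed soft limit");
        }
        this.softLimitBytes = softLimitBytes;
        this.hardLimitBytes = hardLimitBytes;
    }

    /** New tasks are accepted only when not locked out and above the first threshold. */
    boolean acceptsNewTasks(long freeBytes) {
        return !lockedOut && freeBytes >= softLimitBytes;
    }

    /**
     * Checks free space against the second threshold. When space is below it,
     * the tracker is locked out and a task to kill is returned, if one is running.
     */
    Optional<RunningTask> update(long freeBytes, List<RunningTask> running) {
        if (freeBytes >= hardLimitBytes) {
            return Optional.empty();
        }
        lockedOut = true;
        return pickVictim(running);
    }

    /** Lifts the lockout once the job has finished. */
    void jobFinished() {
        lockedOut = false;
    }

    /** Prefers the reduce task with the least progress, else the least-progressed map. */
    static Optional<RunningTask> pickVictim(List<RunningTask> running) {
        Comparator<RunningTask> byProgress = Comparator.comparingDouble(t -> t.progress);
        Optional<RunningTask> reduce = running.stream()
                .filter(t -> t.type == TaskType.REDUCE)
                .min(byProgress);
        if (reduce.isPresent()) {
            return reduce;
        }
        return running.stream()
                .filter(t -> t.type == TaskType.MAP)
                .min(byProgress);
    }
}
```

A tracker loop would call update() with the current free space on each heartbeat, kill the returned task if there is one, and consult acceptsNewTasks() before asking for more work. Keeping the lockout as a separate flag is what makes the second threshold sticky until the job is done, while the first threshold recovers on its own as soon as space frees up.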