Re: guaranteeing disk space?

Owen O'Malley Mon, 15 Sep 2008 13:59:22 -0700


On Sep 15, 2008, at 11:24 AM, Kayla Jay wrote:

How does one do a check or guarantee there's enough disk space when running a hadoop job that you're not sure how much it will produce in its results (temp files, etc.)?

In 0.19 there is new code that waits until the first N% of maps are run and estimates the amount of space required for each of the following tasks. You can see the discussion here:


https://issues.apache.org/jira/browse/HADOOP-657
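
To make the idea concrete, here is a rough sketch of that kind of estimate. It is not the code from the JIRA; the function name, the 10% threshold, and the simple output/input ratio are all assumptions made for illustration.

```python
def estimate_task_output(completed, total_maps, next_input_bytes, min_fraction=0.1):
    """Estimate the output size of a pending task from finished maps.

    completed        -- list of (input_bytes, output_bytes) for finished maps
    total_maps       -- total number of maps in the job
    next_input_bytes -- input size of the task we want an estimate for
    min_fraction     -- hypothetical "first N%" threshold, as a fraction

    Returns None until enough maps have finished to make an estimate.
    """
    if total_maps <= 0 or len(completed) < min_fraction * total_maps:
        return None  # too few finished maps to trust a ratio
    total_in = sum(i for i, _ in completed)
    total_out = sum(o for _, o in completed)
    if total_in == 0:
        return None  # no input seen, so no ratio to scale by
    # Scale the pending task's input by the observed output/input ratio.
    return next_input_bytes * total_out / total_in
```

With two finished maps out of ten that each wrote half as many bytes as they read, a pending task with 200 bytes of input would be estimated at 100 bytes of output.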

The task tracker can also set the mapred.local.dir.minspacestart variable, which controls the minimum amount of disk space that must be free before it will ask for a new task.
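
As an illustration only, the check behaves roughly like the sketch below. This is not the task tracker's code: the function names are made up, I am assuming the threshold is given in bytes, and I am assuming that one local directory with enough room is sufficient.

```python
import shutil


def free_bytes(path):
    """Free bytes on the filesystem that holds `path`."""
    return shutil.disk_usage(path).free


def can_request_task(local_dirs, min_space_start, free_fn=free_bytes):
    """Return True if a new task may be requested.

    local_dirs      -- candidate local directories (hypothetical list)
    min_space_start -- minimum free bytes required; 0 disables the check
    free_fn         -- injectable for testing; defaults to the real disk
    """
    if min_space_start <= 0:
        return True
    # Assumption: one directory with enough room is sufficient.
    return any(free_fn(d) >= min_space_start for d in local_dirs)
```

For example, can_request_task(["/tmp"], 1024**3) asks whether the filesystem holding /tmp has at least 1 GB free.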

Or, what if you run out of disk space on the HDFS if you are running large jobs with large outputs? The job just fails... but how can one assess this resource allocation of disk space while running your jobs?

Map/Reduce works by re-executing tasks that fail, including tasks that fail for lack of disk space. If a task fails, its partial results are erased on the assumption that the task will be run again later. The tasks that finish will have their output in the output directory, even if the job fails.
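
A toy model of that behaviour, to show the shape of it rather than the framework's actual code: each attempt writes into its own scratch directory, a failed attempt's directory is deleted, and a successful one is moved into the output directory. The function name, the directory naming, and the limit of four attempts are assumptions for the example.

```python
import os
import shutil
import tempfile


def run_with_retries(task, output_dir, name, max_attempts=4):
    """Run task(work_dir) until it succeeds or attempts run out.

    Returns the final output path, or None if every attempt failed.
    """
    os.makedirs(output_dir, exist_ok=True)
    for attempt in range(max_attempts):
        # Each attempt gets a private scratch directory.
        work_dir = tempfile.mkdtemp(
            prefix=f"_attempt_{name}_{attempt}_", dir=output_dir)
        try:
            task(work_dir)
        except OSError:
            # e.g. "No space left on device": erase the partial results
            # and let the next attempt start from scratch.
            shutil.rmtree(work_dir, ignore_errors=True)
            continue
        # Success: the attempt's output becomes the task's output.
        final = os.path.join(output_dir, name)
        os.rename(work_dir, final)
        return final
    return None
```

Output from tasks that already succeeded stays in output_dir even if a later task exhausts its attempts.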


-- Owen
