No, Christian, there is currently no way to do it. The temporary directories of failed tasks get cleaned up only at the end of the job. Could you please raise a JIRA? We can discuss this topic there.

Thanks,
Devaraj
On 11/13/08 10:21 PM, "Christian Kunz" <[EMAIL PROTECTED]> wrote:
> We are running a job using up more than 80% of available dfs space.
>
> Many reduce tasks fail because of write errors, and there seems to be a
> negative feedback loop: the space used by failed tasks is not cleaned up,
> making write errors of new tasks even more likely.
>
> E.g. we have 9,000 reduce tasks, 5,500 completed, and the remaining 3,500
> failing repeatedly, because the dfs space required for the remaining tasks
> is largely taken up by the failed tasks.
>
> Is there a configuration option to get the temporary directories of failed
> tasks cleaned up?
>
> Thanks,
> Christian
>
> Sample output of 'hadoop fs -du':
> .../_temporary/_task_200811051109_0002_r_000080_0   8589934731
> .../_temporary/_task_200811051109_0002_r_000080_1   8455717003
> .../_temporary/_task_200811051109_0002_r_000080_2   5771362443
> .../_temporary/_task_200811051109_0002_r_000080_3   7784628363
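
Until the cleanup behavior changes, one possible manual workaround is to remove the leftover per-attempt directories by hand with the dfs shell. This is only a sketch: it assumes the directories under _temporary belong to attempts that have already failed and are no longer being written to, and <output-dir> is a placeholder for the job's output path.

    # list the per-attempt temporary directories left behind under the job output
    hadoop fs -ls <output-dir>/_temporary

    # recursively remove the leftover directory of a failed attempt
    hadoop fs -rmr <output-dir>/_temporary/_task_200811051109_0002_r_000080_0

Removing the directory of an attempt that is still running is likely to make that attempt fail, so this should only be applied to attempts the JobTracker already reports as failed.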
