Look at how many job_ directories there are on your slave nodes. We're
using Cloudera, so they're under the 'userlogs' directory; not sure
where they are on 'pure' Apache.

As we approach 30k we see this (we run a monthly report that does tens
of thousands of jobs in a few days). We've tried tuning the number of
jobs stored in the history on the jobtracker, but it doesn't always
help. So we have an hourly cron job that finds anything older than 4
hours in that directory and removes it. None of our individual jobs
runs for more than 30 minutes, so waiting 4 hours and blowing them away
hasn't caused us any problems.
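For reference, that cleanup is just a single find(1) invocation; the
path and the 4-hour threshold below are assumptions for our Cloudera
layout, so adjust both for your install:

```shell
#!/bin/sh
# Assumed userlogs location; Cloudera keeps job_* dirs under 'userlogs'.
LOGDIR=${LOGDIR:-/var/log/hadoop/userlogs}
[ -d "$LOGDIR" ] || exit 0

# Remove job_* entries not modified in the last 4 hours (240 minutes).
find "$LOGDIR" -mindepth 1 -maxdepth 1 -name 'job_*' -mmin +240 \
    -exec rm -rf {} +
```

Drop that into /etc/cron.hourly (or an equivalent crontab entry) and
it keeps the directory from filling up between runs.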



On Thu, Apr 26, 2012 at 5:17 AM, JunYong Li <lij...@gmail.com> wrote:

> maybe a file hole (sparse file) exists; are the `-sh` and `du -sch /tmp` results the same?
>
