I have a Hadoop job that runs successfully on an input set of about
50 million records. To test scaling and prepare for where we expect to
be a year or two from now, I tried the same job with about four times
as many records.

Most of the map tasks fail with the message:
could not find any valid local directory for
tasktracker/jobcache/job.../jars

The first job writes about 4 TB and runs on a 0.23 cluster.

My general understanding is that this message occurs when a temporary
directory on a local drive fills up.
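
For what it's worth, the kind of per-slave check I have in mind is
something like the Java sketch below. The default path is only a guess
at where mapred.local.dir might point on our slaves; substitute the
real directories:

    import java.io.File;

    // Rough free-space check for the task-local directories.
    public class LocalDirCheck {
        public static void main(String[] args) {
            String[] dirs = args.length > 0
                    ? args
                    : new String[] { "/tmp/hadoop/mapred/local" }; // example path
            for (String d : dirs) {
                File f = new File(d);
                // Report usable vs. total space in MB for each directory.
                System.out.printf("%s: %d MB usable of %d MB total%n",
                        d, f.getUsableSpace() >> 20, f.getTotalSpace() >> 20);
            }
        }
    }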

I have asked our systems group to restart the cluster.

My questions are:
1) Are there commands I can run on a slave to see the issue?
2) Will restarting the cluster clear things out and help?
3) Are there ways to tune the job to mitigate this issue (for example,
compressing map output, as in the sketch below)?
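
On question 3, one knob I am wondering about is compressing the
intermediate map output so that less data lands in the task-local
directories. A minimal driver sketch, assuming the 0.23 property names
mapreduce.map.output.compress and mapreduce.map.output.compress.codec
(please correct me if those are wrong for this release):

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.mapreduce.Job;

    public class ScaledJobDriver {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();
            // Compress intermediate map output before it is spilled to
            // the task-local directories (property names assumed for 0.23).
            conf.setBoolean("mapreduce.map.output.compress", true);
            conf.set("mapreduce.map.output.compress.codec",
                    "org.apache.hadoop.io.compress.SnappyCodec");
            Job job = new Job(conf, "scaled-test"); // placeholder job name
            // ... remaining mapper/reducer/IO setup as in the current driver ...
        }
    }

Would that meaningfully reduce the pressure on the local drives, or is
the jobcache/jars failure unrelated to intermediate output size?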
