We're just now moving from a Nutch 0.9 installation to 1.0, so I'm not
entirely new to this.  However, I can't even get past the first fetch now,
due to a Hadoop error.

Looking in the mailing list archives, this error is normally caused by
either a permissions problem or a full disk.  I overrode the use of /tmp by
setting hadoop.tmp.dir to a location with plenty of space, and I'm running
the crawl as root, yet I'm still getting the error below.
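For reference, the override lives in conf/hadoop-site.xml and looks roughly
like this (the path shown is just an example; mapred.local.dir, which is
where the spill files in the trace below are written, defaults to
${hadoop.tmp.dir}/mapred/local):

  <property>
    <name>hadoop.tmp.dir</name>
    <value>/data/hadoop-tmp</value>
  </property>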

Any thoughts?

Running on AIX with plenty of disk and RAM.

2010-04-16 12:49:51,972 INFO  fetcher.Fetcher - -finishing thread FetcherThread, activeThreads=0
2010-04-16 12:49:52,267 INFO  fetcher.Fetcher - -activeThreads=0, spinWaiting=0, fetchQueues.totalSize=0
2010-04-16 12:49:52,268 INFO  fetcher.Fetcher - -activeThreads=0,
2010-04-16 12:49:52,270 WARN  mapred.LocalJobRunner - job_local_0005
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find any valid local directory for taskTracker/jobcache/job_local_0005/attempt_local_0005_m_000000_0/output/spill0.out
        at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathForWrite(LocalDirAllocator.java:335)
        at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathForWrite(LocalDirAllocator.java:124)
        at org.apache.hadoop.mapred.MapOutputFile.getSpillFileForWrite(MapOutputFile.java:107)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:930)
        at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:842)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
        at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:138)
