Thank you for your reply, Sami.

> > I don't intend to run Hadoop at all, so this
> > hadoop-site.xml is empty.
...
> You should at least set values for 'mapred.system.dir' and
> 'mapred.local.dir' and point them to a dir that has enough space
> available (I think they default to under /tmp, at least on my system,
> which is far too small for larger jobs).

OK, I just copied the definitions for these properties from
hadoop-default.xml and prepended "C:" to each value so that they really
refer to C:\tmp. C: has 65 GB free, and this practice crawl crawls a
directory containing 20 documents totalling less than 10 MB, so I figure
C: has more than adequate free space.
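For reference, here is roughly what I added to hadoop-site.xml (the
local dir matches the path in the error below; I'm quoting the system
dir value from memory, so it may not be exact):

  <property>
    <name>mapred.system.dir</name>
    <value>C:/tmp/hadoop/mapred/system</value>
    <!-- copied from hadoop-default.xml with "C:" prepended -->
  </property>

  <property>
    <name>mapred.local.dir</name>
    <value>C:/tmp/hadoop/mapred/local</value>
    <!-- this is the path that shows up in the "Couldn't rename" error -->
  </property>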

But I've still got the same error:
2006-06-22 10:54:01,548 WARN  mapred.LocalJobRunner
(LocalJobRunner.java:run(119)) - job_x5jmir
java.io.IOException: Couldn't rename
C:/tmp/hadoop/mapred/local/map_ye7oza/part-0.out
        at
org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:102)
Exception in thread "main" java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:342)
        at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:55)
        at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)

After Nutch exited, I checked: the directory
C:/tmp/hadoop/mapred/local/map_ye7oza/
does exist, but it was empty and contained no file called part-0.out.

I'd appreciate any other suggestions you might have.

-kuro


