KuroSaka TeruHiko (JIRA) wrote:
[ http://issues.apache.org/jira/browse/NUTCH-266?page=comments#action_12416945 ]
KuroSaka TeruHiko commented on NUTCH-266:
-----------------------------------------
I am experiencing pretty much the same symptom with the nightly builds of
5/31/2006 up to 6/14/2006, which is the last one I tested.
Here's the result of my "nutch crawl" run with DEBUG level log turned on.
2006-06-16 17:04:05,932 INFO mapred.LocalJobRunner (LocalJobRunner.java:progress(140)) - C:/opt/nutch-060614/test/index/segments/20060616170358/crawl_parse/part-00000:0+62
2006-06-16 17:04:05,948 WARN mapred.LocalJobRunner (LocalJobRunner.java:run(119)) - job_4wsxze
java.io.IOException: Couldn't rename /tmp/hadoop/mapred/local/map_5n5aid/part-0.out
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:102)
Exception in thread "main" java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:342)
at org.apache.nutch.crawl.CrawlDb.update(CrawlDb.java:55)
at org.apache.nutch.crawl.Crawl.main(Crawl.java:114)
Prior to this fatal exception, I've seen many occurrences of this exception:
2006-06-16 17:04:05,854 INFO conf.Configuration (Configuration.java:loadResource(397)) - parsing file:/C:/opt/nutch-060614/conf/hadoop-site.xml
<snip>
This isn't really an exception; it's there just to print the stack trace
(so one can track who is calling it).
I do not intend to run hadoop at all, so this hadoop-site.xml is empty.
It just has this empty element:
<configuration>
</configuration>
You should at least set values for 'mapred.system.dir' and 'mapred.local.dir'
and point them to a directory that has enough space available. (I think they
default to somewhere under /tmp, at least on my system, which is far too small
for larger jobs.)
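As a rough sketch of what that could look like in hadoop-site.xml: the two
property names are the ones mentioned above, but the directory paths are
made-up placeholders (assumptions for illustration only), so substitute any
location on your machine that has enough free space.

```xml
<configuration>
  <!-- Placeholder path (assumption): any directory with enough free space -->
  <property>
    <name>mapred.system.dir</name>
    <value>C:/opt/nutch-data/mapred/system</value>
  </property>
  <!-- Placeholder path (assumption): any directory with enough free space -->
  <property>
    <name>mapred.local.dir</name>
    <value>C:/opt/nutch-data/mapred/local</value>
  </property>
</configuration>
```

With something like this in place, the local job runner should no longer
fall back to the default location under /tmp for these two directories.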
--
Sami Siren