The same error has now come up on two separate fetches, both using the
new Hadoop 0.10.1. The first failure came up on my regular fetch using my large
Nutch DB, but to rule out any problems with that (possibly related to the new
fetch statuses) I created a brand new DB using the standard DMOZ inject. Just
now that fetch failed too, with the same error.
Here is the output:
2007-01-17 01:50:21,480 WARN mapred.LocalJobRunner - job_m5rew8
java.lang.NullPointerException
    at org.apache.hadoop.io.SequenceFile$Sorter$MergeQueue.merge(SequenceFile.java:2158)
    at org.apache.hadoop.io.SequenceFile$Sorter.merge(SequenceFile.java:1892)
    at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:498)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:191)
    at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:109)
2007-01-17 01:50:21,629 FATAL fetcher.Fetcher - Fetcher: java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:402)
    at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:469)
    at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:504)
    at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:189)
    at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:476)
It does not fail right away when the fetch starts, but after around 100k URLs,
by which point my Hadoop map file has grown to about 2.4GB.
_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers