Sorry for not including that. This is the final output from the hadoop.log
file for the reduce stage:
2006-12-14 03:43:10,316 INFO plugin.PluginRepository - Plugins: looking in: /usr/local/nutch/build/nutch-0.9-dev/plugins
2006-12-14 03:43:10,650 INFO plugin.PluginRepository - Plugin Auto-activation mode: [true]
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Registered Plugins:
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - CyberNeko HTML Parser (lib-nekohtml)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Site Query Filter (query-site)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Basic URL Normalizer (urlnormalizer-basic)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Html Parse Plug-in (parse-html)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Pass-through URL Normalizer (urlnormalizer-pass)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Regex URL Filter Framework (lib-regex-filter)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Basic Indexing Filter (index-basic)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Basic Summarizer Plug-in (summary-basic)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Text Parse Plug-in (parse-text)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - JavaScript Parser (parse-js)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Regex URL Filter (urlfilter-regex)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Basic Query Filter (query-basic)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - HTTP Framework (lib-http)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - URL Query Filter (query-url)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Regex URL Normalizer (urlnormalizer-regex)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Http Protocol Plug-in (protocol-http)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - the nutch core extension points (nutch-extensionpoints)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - OPIC Scoring Plug-in (scoring-opic)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Registered Extension-Points:
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Nutch Summarizer (org.apache.nutch.searcher.Summarizer)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Nutch Scoring (org.apache.nutch.scoring.ScoringFilter)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Nutch Protocol (org.apache.nutch.protocol.Protocol)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Nutch URL Normalizer (org.apache.nutch.net.URLNormalizer)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Nutch URL Filter (org.apache.nutch.net.URLFilter)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - HTML Parse Filter (org.apache.nutch.parse.HtmlParseFilter)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Nutch Online Search Results Clustering Plugin (org.apache.nutch.clustering.OnlineClusterer)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Nutch Indexing Filter (org.apache.nutch.indexer.IndexingFilter)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Nutch Content Parser (org.apache.nutch.parse.Parser)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Ontology Model Loader (org.apache.nutch.ontology.Ontology)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Nutch Analysis (org.apache.nutch.analysis.NutchAnalyzer)
2006-12-14 03:43:10,651 INFO plugin.PluginRepository - Nutch Query Filter (org.apache.nutch.searcher.QueryFilter)
2006-12-14 03:43:10,698 WARN mapred.LocalJobRunner - job_xv8awo
java.lang.NoSuchMethodError: org.apache.hadoop.io.MapFile$Writer.<init>(Lorg/apache/hadoop/conf/Configuration;Lorg/apache/hadoop/fs/FileSystem;Ljava/lang/String;Ljava/lang/Class;Ljava/lang/Class;)V
at org.apache.nutch.parse.ParseOutputFormat.getRecordWriter(ParseOutputFormat.java:74)
at org.apache.nutch.fetcher.FetcherOutputFormat$1.<init>(FetcherOutputFormat.java:72)
at org.apache.nutch.fetcher.FetcherOutputFormat.getRecordWriter(FetcherOutputFormat.java:61)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:258)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:130)
2006-12-14 03:43:11,038 FATAL fetcher.Fetcher - Fetcher:
java.io.IOException: Job failed!
at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:393)
at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:445)
at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:480)
at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187)
at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:452)
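A NoSuchMethodError like the one above generally means a class was compiled against one version of a library and is running against another, so it may be worth checking which Hadoop jar is actually on the classpath. Below is a minimal sketch of such a check using plain reflection. The ConstructorCheck class and its hasConstructor method are hypothetical helpers written for this message, not part of Nutch or Hadoop; the constructor signature is copied from the error in the log. It only gives a meaningful answer for MapFile.Writer when run with the same classpath that bin/nutch uses.

```java
import java.lang.reflect.Constructor;

// Hypothetical helper (not part of Nutch or Hadoop): reports whether a class
// visible on the current classpath declares a constructor with the given
// parameter types.
final class ConstructorCheck {

    static boolean hasConstructor(String className, String... paramTypeNames) {
        try {
            ClassLoader loader = ConstructorCheck.class.getClassLoader();
            // initialize=false: load the class without running static initializers
            Class<?> target = Class.forName(className, false, loader);
            Class<?>[] params = new Class<?>[paramTypeNames.length];
            for (int i = 0; i < params.length; i++) {
                params[i] = Class.forName(paramTypeNames[i], false, loader);
            }
            Constructor<?> ctor = target.getDeclaredConstructor(params);
            return ctor != null;
        } catch (ClassNotFoundException | NoSuchMethodException | LinkageError e) {
            // Class missing, constructor missing, or class could not be linked
            return false;
        }
    }

    public static void main(String[] args) {
        // Parameter types taken from the descriptor in the NoSuchMethodError
        boolean found = hasConstructor(
            "org.apache.hadoop.io.MapFile$Writer",
            "org.apache.hadoop.conf.Configuration",
            "org.apache.hadoop.fs.FileSystem",
            "java.lang.String",
            "java.lang.Class",
            "java.lang.Class");
        System.out.println(
            "MapFile.Writer(Configuration, FileSystem, String, Class, Class) present: "
            + found);
    }
}
```

If this reports false with the runtime classpath, the Hadoop jar being loaded probably does not match the one the Nutch classes were built against.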
Also note that the file system was never in danger of filling up, so this
wasn't a free-space issue. I hope this information is more helpful than what
I sent previously. Thanks.
-----Original Message-----
From: Andrzej Bialecki [mailto:[EMAIL PROTECTED]
Sent: Thursday, December 14, 2006 5:27 AM
To: [email protected]
Subject: Re: error with trunk: linkdb copied to wrong dir
Sean Dean wrote:
> Hello,
>
> In an attempt to run my regular scheduled fetch, and to also see if I
> could reproduce your error it seems I may have found another one. The
> following procedure was done with the most recent trunk version, including
> the Hadoop-0.9 update.
>
> bin/nutch generate crawl/crawldb crawl/segments -topN 1000000
>
> This command was successful, fetch list was generated without error.
>
> bin/nutch fetch crawl/segments/20061211105651
>
> Actual fetch was successful, but the reduce stage failed. Error output was:
>
> Fetcher: java.io.IOException: Job failed!
> at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:393)
> at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:445)
> at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:480)
> at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187)
> at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:452)
>
>
This stack trace doesn't tell us anything except that the job failed.
Please find the corresponding entries in the tasktracker's log; they should
provide more details about the reason for this failure. You may also
wish to increase the log level to get more detail in the logs.
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com