Hello,

While running my regularly scheduled fetch, and also trying to reproduce your 
error, I seem to have found another one. The following procedure was done with 
the most recent trunk version, including the Hadoop 0.9 update.

bin/nutch generate crawl/crawldb crawl/segments -topN 1000000

This command was successful; the fetch list was generated without error.

bin/nutch fetch crawl/segments/20061211105651

The actual fetch was successful, but the reduce stage failed. The error output was:

Fetcher: java.io.IOException: Job failed!
        at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:393)
        at org.apache.nutch.fetcher.Fetcher.fetch(Fetcher.java:445)
        at org.apache.nutch.fetcher.Fetcher.run(Fetcher.java:480)
        at org.apache.hadoop.util.ToolBase.doMain(ToolBase.java:187)
        at org.apache.nutch.fetcher.Fetcher.main(Fetcher.java:452)

I will revert with svn to the revision from before the Hadoop upgrade and 
re-run the fetch. I expect it to work without error, as it did before this 
update. I will report back with my findings.
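
For anyone who wants to repeat the generate and fetch steps above without 
copying the segment timestamp by hand, here is a minimal sketch. The 
latest_segment helper is hypothetical (it is not part of Nutch), and it assumes 
segment directories are named by timestamp, as the one above is, so that a 
lexical sort puts the newest one last.

```shell
#!/bin/sh
# Hypothetical helper, not part of Nutch: print the newest segment directory.
# Assumes segment names are timestamps (e.g. 20061211105651), so sorting
# them lexically leaves the most recent one at the end.
latest_segment() {
    ls -d "$1"/2* 2>/dev/null | sort | tail -n 1
}

# Usage, assuming the same crawl/ layout as in the commands above:
#   bin/nutch generate crawl/crawldb crawl/segments -topN 1000000
#   bin/nutch fetch "$(latest_segment crawl/segments)"
```

If no segment directory exists, the helper prints nothing, so the result 
should be checked before it is passed to the fetch command.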

My system details, for debugging purposes:

link# uname -a
FreeBSD link.enhancededge.com 6.2-PRERELEASE FreeBSD 6.2-PRERELEASE #0: Sun Nov 12 06:09:44 EST 2006     [EMAIL PROTECTED]:/usr/obj/usr/src/sys/LINK  amd64

link# java -version
java version "1.5.0"
Java(TM) 2 Runtime Environment, Standard Edition (build diablo-1.5.0-b01)
Java HotSpot(TM) 64-Bit Server VM (build diablo-1.5.0_07-b01, mixed mode)

link# svn update
At revision 487147.

link# cat hadoop-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/nutch/build/nutch-0.9-dev/temp/hadoop-${user.name}</value>
  <description>Hadoop temp directory</description>
</property>
</configuration>