Hi all,
After migrating Nutch from Windows to Linux, I get the following errors.
See the log below:
-------------------------------------------------------------------------------------
Apr 29, 2005 3:30:00 PM org.apache.nutch.web.CrawlJobAdapter execute
INFO: Job:CrawlJobs.CrawlJob executing @[Fri Apr 29 15:30:00 CST 2005]
050429 153000 %nutch: -local /opt/tomcat/tomcat-nutch/nutch-ccs/WEB-INF/classes/seed.txt -dir /opt/tomcat/tomcat-nutch/tomcat-nutch-5.0.19/bin/nutch-tmp/nutchcrawl-20050429153000 -depth 10 -showThreadID
050429 153000 parsing file:/opt/tomcat/tomcat-nutch/nutch-ccs/WEB-INF/classes/nutch-default.xml
050429 153000 parsing file:/opt/tomcat/tomcat-nutch/nutch-ccs/WEB-INF/classes/crawl-tool.xml
050429 153000 parsing file:/opt/tomcat/tomcat-nutch/nutch-ccs/WEB-INF/classes/nutch-site.xml
050429 153000 crawl started in: /opt/tomcat/tomcat-nutch/tomcat-nutch-5.0.19/bin/nutch-tmp/nutchcrawl-20050429153000
050429 153000 rootUrlFile = /opt/tomcat/tomcat-nutch/nutch-ccs/WEB-INF/classes/seed.txt
050429 153000 threads = 10
050429 153000 depth = 10
050429 153000 Exceptions in crawl process:
java.net.UnknownHostException: prodfl04: prodfl04
java.lang.RuntimeException: java.net.UnknownHostException: prodfl04: prodfl04
at org.apache.nutch.io.SequenceFile$Writer.<init>(SequenceFile.java:67)
at org.apache.nutch.io.MapFile$Writer.<init>(MapFile.java:88)
at org.apache.nutch.db.WebDBWriter.<init>(WebDBWriter.java:1507)
at org.apache.nutch.db.WebDBWriter.createWebDB(WebDBWriter.java:1438)
at org.apache.nutch.tools.WebDBAdminTool.main(WebDBAdminTool.java:172)
at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:133)
at org.apache.nutch.web.CrawlJobAdapter.execute(CrawlJobAdapter.java:66)
at org.quartz.core.JobRunShell.run(JobRunShell.java:191)
at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:516)
Caused by: java.net.UnknownHostException: prodfl04: prodfl04
at java.net.InetAddress.getLocalHost(InetAddress.java:1191)
at org.apache.nutch.io.SequenceFile$Writer.<init>(SequenceFile.java:64)
... 8 more
------------------------------------------------------------
My Linux box's hostname is prodfl04.
The code that throws the exception is here (SequenceFile.java):
-----------------------------------------------------------
private final byte[] sync;                          // 16 random bytes
{
  try {                                             // use hash of uid + host
    MessageDigest digester = MessageDigest.getInstance("MD5");
    digester.update((new UID()+"@"+InetAddress.getLocalHost()).getBytes());
    sync = digester.digest();
  } catch (Exception e) {
    throw new RuntimeException(e);
  }
}
--------------------------------------------------------------
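To separate this from Nutch itself, the failing call can be tried on its own. The following is only a minimal sketch: the class name HostCheck and the method describeLocalHost are made up for this test and are not part of Nutch. It makes the same InetAddress.getLocalHost() call as the initializer above and reports the outcome instead of throwing. If it prints "unresolved" on the Linux box, that would suggest the JVM cannot resolve the machine's own hostname (for example, no entry for prodfl04 in /etc/hosts or DNS) rather than a problem in Nutch.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical helper class for diagnosis only; not part of Nutch.
class HostCheck {

    // Makes the same call as the SequenceFile initializer and returns
    // a description of the result instead of throwing.
    static String describeLocalHost() {
        try {
            InetAddress local = InetAddress.getLocalHost();
            return "resolved: " + local;
        } catch (UnknownHostException e) {
            // Thrown when the local hostname cannot be resolved to an address.
            return "unresolved: " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        System.out.println(describeLocalHost());
    }
}
```

Compile it with javac and run it with the same JVM and user that Tomcat uses, so that the result reflects the environment in which the crawl job fails.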
Can someone explain why this happens? Even when I run the application as root,
the exception occurs every time. What should I watch out for on a Linux box
when deploying Nutch?
Regards
/Jack