Hi Michael
I solved it.
One question remains: is InetAddress.getLocalHost() the best choice in
digester.update((new UID()+"@"+InetAddress.getLocalHost()).getBytes());
??
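
To make the question concrete, here is a minimal sketch of a more defensive
variant. It is not the Nutch code: the class SyncMarker and the helpers
hostLabel() and newSync() are hypothetical names, and it assumes Java 7 or
later (for InetAddress.getLoopbackAddress() and StandardCharsets). The idea
is that the host part only adds entropy to the hash, so an unresolvable
hostname could fall back to the loopback address instead of aborting.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.nio.charset.StandardCharsets;
import java.rmi.server.UID;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical class, for illustration only; not part of Nutch.
public class SyncMarker {

    // Returns a host label for the hash input. If the local hostname
    // cannot be resolved, falls back to the loopback address instead
    // of propagating UnknownHostException.
    static String hostLabel() {
        try {
            return InetAddress.getLocalHost().toString();
        } catch (UnknownHostException e) {
            return InetAddress.getLoopbackAddress().toString();
        }
    }

    // Builds 16 bytes from the MD5 hash of "<uid>@<host label>".
    static byte[] newSync() {
        try {
            MessageDigest digester = MessageDigest.getInstance("MD5");
            String seed = new UID() + "@" + hostLabel();
            digester.update(seed.getBytes(StandardCharsets.UTF_8));
            return digester.digest();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is required on every Java platform, so this is unexpected.
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("sync marker length: " + newSync().length);
    }
}
```

With the fallback, two machines with broken name resolution would share the
same host label, so uniqueness would then rest on the UID alone. Whether that
is acceptable is part of what I am asking.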
/Jack
On 4/29/05, Michael Nebel <[EMAIL PROTECTED]> wrote:
> Hi Jack,
>
> To me this looks like a problem with how the resolver library under Linux
> resolves your hostnames. How is your network configured? Can you try
> using the fully qualified domain name of your server instead of just
> "prodfl04" (something like "prodfl04.THIS.IS-THE-DOMAIN.COM")?
> That should work. If not, perhaps you can test the name resolution from a
> command shell (using "# ping prodfl04").
>
> regards
>
> Michael
>
> Jack Tang schrieb:
>
> > Hi All
> >
> > When I migrated Nutch from Windows to Linux, some errors occurred.
> > See the log below:
> > -------------------------------------------------------------------------------------
> > Apr 29, 2005 3:30:00 PM org.apache.nutch.web.CrawlJobAdapter execute
> > INFO: Job:CrawlJobs.CrawlJob executing @[Fri Apr 29 15:30:00 CST 2005]
> > 050429 153000 %nutch: -local
> > /opt/tomcat/tomcat-nutch/nutch-ccs/WEB-INF/classes/seed.txt -dir
> > /opt/tomcat/tomcat-nutch/tomcat-nutch-5.0.19/bin/nutch-tmp/nutchcrawl-20050429153000
> > -depth 10 -showThreadID
> > 050429 153000 parsing
> > file:/opt/tomcat/tomcat-nutch/nutch-ccs/WEB-INF/classes/nutch-default.xml
> > 050429 153000 parsing
> > file:/opt/tomcat/tomcat-nutch/nutch-ccs/WEB-INF/classes/crawl-tool.xml
> > 050429 153000 parsing
> > file:/opt/tomcat/tomcat-nutch/nutch-ccs/WEB-INF/classes/nutch-site.xml
> > 050429 153000 crawl started in:
> > /opt/tomcat/tomcat-nutch/tomcat-nutch-5.0.19/bin/nutch-tmp/nutchcrawl-20050429153000
> > 050429 153000 rootUrlFile =
> > /opt/tomcat/tomcat-nutch/nutch-ccs/WEB-INF/classes/seed.txt
> > 050429 153000 threads = 10
> > 050429 153000 depth = 10
> > 050429 153000 Exceptions in crawl process:
> > java.net.UnknownHostException: prodfl04: prodfl04
> > java.lang.RuntimeException: java.net.UnknownHostException: prodfl04: prodfl04
> > at org.apache.nutch.io.SequenceFile$Writer.<init>(SequenceFile.java:67)
> > at org.apache.nutch.io.MapFile$Writer.<init>(MapFile.java:88)
> > at org.apache.nutch.db.WebDBWriter.<init>(WebDBWriter.java:1507)
> > at org.apache.nutch.db.WebDBWriter.createWebDB(WebDBWriter.java:1438)
> > at org.apache.nutch.tools.WebDBAdminTool.main(WebDBAdminTool.java:172)
> > at org.apache.nutch.tools.CrawlTool.main(CrawlTool.java:133)
> > at org.apache.nutch.web.CrawlJobAdapter.execute(CrawlJobAdapter.java:66)
> > at org.quartz.core.JobRunShell.run(JobRunShell.java:191)
> > at org.quartz.simpl.SimpleThreadPool$WorkerThread.run(SimpleThreadPool.java:516)
> > Caused by: java.net.UnknownHostException: prodfl04: prodfl04
> > at java.net.InetAddress.getLocalHost(InetAddress.java:1191)
> > at org.apache.nutch.io.SequenceFile$Writer.<init>(SequenceFile.java:64)
> > ... 8 more
> > ------------------------------------------------------------
> > My Linux box's hostname is prodfl04.
> > The code that throws the exception is here (SequenceFile.java):
> >
> > -----------------------------------------------------------
> > private final byte[] sync; // 16 random bytes
> > {
> >   try { // use hash of uid + host
> >     MessageDigest digester = MessageDigest.getInstance("MD5");
> >     digester.update((new UID()+"@"+InetAddress.getLocalHost()).getBytes());
> >     sync = digester.digest();
> >   } catch (Exception e) {
> >     throw new RuntimeException(e);
> >   }
> > }
> > --------------------------------------------------------------
> > Can someone explain why? Even when I run the application as root, the
> > exception occurs again and again. What should I watch out for on a
> > Linux box when deploying Nutch?
> >
> > Regards
> > /Jack
>
> --
> Michael Nebel
> Internet: http://www.netluchs.de/
>
>