I have never done it this way (creating webdb content from segments only), so I do not know whether it works. I do not have time to test it myself at the moment, sorry.
Regards
Piotr

EM wrote:
The problem is still there, maybe I'm doing something wrong?

1. 'rm -r db'
2. 'mkdir db'
3. 'bin/nutch admin db -create'
4. I'll then updatedb db from a fetched segment, this should fill it up with links?
5. 'bin/nutch analyze db 7'
And it fails here, with three 'tmp<something>' directories and webdb.new present.
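
The steps above can be sketched as a small shell script. This is only a sketch under assumptions: the variables NUTCH, DB and SEGMENT, the helper names run and rebuild_webdb, and the segment path are illustrative and not from the original post, and the updatedb argument order (db first, then segment) is assumed from the description in step 4. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the rebuild sequence from the numbered steps.
# NUTCH, DB and SEGMENT are illustrative variables (assumptions).
NUTCH="${NUTCH:-bin/nutch}"
DB="${DB:-db}"
SEGMENT="${SEGMENT:-segments/EXAMPLE}"   # hypothetical fetched-segment path

# With DRY_RUN=1 (the default) each command is printed, not executed.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# Runs the five steps in order and stops at the first failure.
rebuild_webdb() {
    run rm -r "$DB" &&
    run mkdir "$DB" &&
    run "$NUTCH" admin "$DB" -create &&
    run "$NUTCH" updatedb "$DB" "$SEGMENT" &&
    run "$NUTCH" analyze "$DB" 7
}
```

Calling rebuild_webdb with DRY_RUN=0 would actually execute the commands, so it is worth checking the printed output first.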


-----Original Message-----
From: Piotr Kosiorowski [mailto:[EMAIL PROTECTED]
Sent: Tuesday, August 30, 2005 3:07 PM
To: [email protected]
Subject: Re: Analyser error

It looks like you have temporary results left over from a previous run (probably one that was killed or did not finish successfully). It should be safe to remove the db\webdb.new directory and start again.
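
A minimal sketch of that cleanup, assuming a POSIX shell (for example Cygwin on Windows, hence forward slashes). The function name clean_stale_webdb is hypothetical. It removes only webdb.new; any tmp* directories are listed for inspection rather than deleted, since nothing above says they are safe to remove.

```shell
#!/bin/sh
# Remove a leftover webdb.new directory from an interrupted run.
# clean_stale_webdb is a hypothetical helper; $1 is the db directory.
clean_stale_webdb() {
    db_dir="${1:-db}"
    stale="$db_dir/webdb.new"
    if [ -d "$stale" ]; then
        echo "removing $stale"
        rm -r "$stale"
    else
        echo "nothing to remove in $db_dir"
    fi
    # Report tmp* directories without deleting them.
    for d in "$db_dir"/tmp*; do
        if [ -d "$d" ]; then
            echo "leftover: $d"
        fi
    done
    return 0
}
```

Running clean_stale_webdb db before starting the analysis again would clear the stale directory.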
Regards
Piotr
EM wrote:

What does it mean if 'bin/nutch analyze db 7' fails with:


050830 024914 Target pages from init(): 27419
050830 024914 Processing pagesByURL: Sorted 27419 instructions in 0.172 seconds.
050830 024914 Processing pagesByURL: Sorted 159412.79069767444 instructions/second
Finished at Tue Aug 30 02:49:14 EDT 2005
Exception in thread "main" java.io.IOException: already exists: db\webdb.new\pagesByURL
       at org.apache.nutch.io.MapFile$Writer.<init>(MapFile.java:86)
       at org.apache.nutch.db.WebDBWriter$CloseProcessor.closeDown(WebDBWriter.java:549)
       at org.apache.nutch.db.WebDBWriter.close(WebDBWriter.java:1544)
       at org.apache.nutch.tools.DistributedAnalysisTool.completeRound(DistributedAnalysisTool.java:562)
       at org.apache.nutch.tools.LinkAnalysisTool.iterate(LinkAnalysisTool.java:60)
       at org.apache.nutch.tools.LinkAnalysisTool.main(LinkAnalysisTool.java:81)







