[Nutch-general] Number of links in WebDB

Yousef Ourabi Sun, 08 Jan 2006 00:41:28 -0800

Hello,

About a week ago there was a post about finding out how many URLs
had been crawled.


bin/nutch readdb -local db -stats

I used a dmoz dump to bootstrap the system. I assume these stats don't
update during the crawl because of the two lock files (dbreadlock and
dbwritelock). My questions are:

1) When does the link db get updated (since in theory outbound links are
found on many of the pages)?
2) What's the 0.8 version of org.apache.nutch.db.WebDBReader? Is it
crawl.CrawlDbReader?

Thanks a ton.

Yousef
