I was under the impression that setting topN for crawl cycles would limit the number of items each iteration of the crawl would fetch/parse, but that repeated crawl cycles would eventually fetch ALL the URLs. My continuous crawl has now stopped fetching/parsing, and the crawldb stats show db_unfetched at 133,359.
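For context, each cycle follows the usual Nutch 1.x generate/fetch/parse/updatedb loop; the paths and the topN value below are placeholders, not my exact settings:

```shell
# One crawl cycle: generate a fetch list capped at topN URLs,
# fetch and parse that segment, then merge results into the crawldb.
bin/nutch generate crawl/crawldb crawl/segments -topN 1000
SEGMENT=crawl/segments/$(ls -t crawl/segments | head -1)
bin/nutch fetch $SEGMENT
bin/nutch parse $SEGMENT
bin/nutch updatedb crawl/crawldb $SEGMENT

# Crawldb stats, where db_unfetched is reported:
bin/nutch readdb crawl/crawldb -stats
```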
Why is it no longer fetching URLs if there are so many unfetched?

--
View this message in context: http://lucene.472066.n3.nabble.com/db-unfetched-large-number-but-crawling-not-fetching-any-longer-tp3851587p3851587.html
Sent from the Nutch - User mailing list archive at Nabble.com.