I have the same problem with 0.7.2. 

My guess is that updatedb isn't actually adding any new links to the webdb.

I ran bin/nutch crawl with a depth of 1; it fetched the initial page and
registered its outgoing ("to") links in the webdb.

Then I ran the recrawl script from
http://today.java.net/pub/a/today/2006/02/16/introduction-to-nutch-2.html
many times and then did a

bin/nutch readdb crawldb/db -dumplinks

The dump doesn't contain any URLs beyond the initial set. I thought that
updatedb in the recrawl script was supposed to add newly discovered URLs to
the webdb so that they would be fetched and indexed next time around.
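
For anyone who wants to check the same thing, here is a rough sketch of how
two link dumps could be compared, one taken before the recrawl and one after.
The helper name new_urls and the file names before.txt and after.txt are made
up for illustration. I am also assuming that URLs can be pulled out of the
dump with a simple pattern match, since I have not verified the exact dump
format.

```shell
# Hypothetical helper (not part of Nutch): print URLs that appear in the
# second dump file but not in the first.
#
# Intended use, with the readdb command from above:
#   bin/nutch readdb crawldb/db -dumplinks > before.txt
#   ... run the recrawl script ...
#   bin/nutch readdb crawldb/db -dumplinks > after.txt
#   new_urls before.txt after.txt
new_urls() {
    before_sorted=$(mktemp)
    after_sorted=$(mktemp)
    # Assumption: anything that looks like an http(s) URL in the dump counts,
    # regardless of the surrounding dump format.
    grep -Eo 'https?://[^[:space:]]+' "$1" | sort -u > "$before_sorted"
    grep -Eo 'https?://[^[:space:]]+' "$2" | sort -u > "$after_sorted"
    # comm -13 keeps only the lines unique to the second (after) file.
    comm -13 "$before_sorted" "$after_sorted"
    rm -f "$before_sorted" "$after_sorted"
}
```

If this prints nothing after several recrawl rounds, that would match what I
am seeing: no new URLs are reaching the webdb.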

_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general