I am using the current trunk. Here are the statistics before the recrawl loop:
status 1 (db_unfetched):    56968
status 2 (db_fetched):      6170
status 3 (db_gone):         818
status 4 (db_redir_temp):   299
status 5 (db_redir_perm):   270
status 6 (db_notmodified):  18

I set these parameters:
depth=5
topn=5000
adddays=30

It runs 5 loops, and afterwards I see these statistics:
status 1 (db_unfetched):    57255
status 2 (db_fetched):      6326
status 3 (db_gone):         837
status 4 (db_redir_temp):   313
status 5 (db_redir_perm):   276
status 6 (db_notmodified):  18

Fewer than 200 URLs were fetched compared to the previous statistics
(db_fetched went from 6170 to 6326)? Where is the problem?
With depth=5 and topn=5000, I think it should fetch about 25000 URLs.
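
To make the numbers concrete, here is a rough Python sketch of the arithmetic. The counts are copied from the two listings above. The names BEFORE, AFTER and parse_stats are only illustrative and are not part of Nutch. The 25000 figure assumes every loop could fetch a full topn URLs, which is an upper bound rather than a guarantee.

```python
import re

# Counts copied from the statistics before and after the 5 recrawl loops.
BEFORE = """\
status 1 (db_unfetched): 56968
status 2 (db_fetched): 6170
status 3 (db_gone): 818
status 4 (db_redir_temp): 299
status 5 (db_redir_perm): 270
status 6 (db_notmodified): 18
"""

AFTER = """\
status 1 (db_unfetched): 57255
status 2 (db_fetched): 6326
status 3 (db_gone): 837
status 4 (db_redir_temp): 313
status 5 (db_redir_perm): 276
status 6 (db_notmodified): 18
"""


def parse_stats(text):
    """Map each status name (e.g. 'db_fetched') to its count."""
    pattern = re.compile(r"status \d+ \((\w+)\):\s+(\d+)")
    return {name: int(count) for name, count in pattern.findall(text)}


before = parse_stats(BEFORE)
after = parse_stats(AFTER)

# Change per status across the whole recrawl run.
delta = {name: after[name] - before[name] for name in before}

# Assumption: each loop may fetch up to topn URLs, so depth * topn is a ceiling.
depth, topn = 5, 5000
expected_max = depth * topn

for name, change in delta.items():
    print(f"{name}: {change:+d}")
print(f"expected at most {expected_max}, db_fetched grew by {delta['db_fetched']}")
```

So db_fetched grew by only a small fraction of what the parameters would allow, while db_unfetched still grew.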
-- 
View this message in context: 
http://www.nabble.com/nutch-fetches-already-fetched-urls-again-and-again-tp22226407p22227636.html
Sent from the Nutch - User mailing list archive at Nabble.com.
