I use the current trunk. Example statistics before the recrawl loop:

status 1 (db_unfetched): 56968
status 2 (db_fetched): 6170
status 3 (db_gone): 818
status 4 (db_redir_temp): 299
status 5 (db_redir_perm): 270
status 6 (db_notmodified): 18

I set these parameters: depth=5 topn=5000 adddays=30

It runs 5 loops, and afterwards I see these statistics:

status 1 (db_unfetched): 57255
status 2 (db_fetched): 6326
status 3 (db_gone): 837
status 4 (db_redir_temp): 313
status 5 (db_redir_perm): 276
status 6 (db_notmodified): 18

Fewer than 200 URLs were fetched compared to the previous statistics (db_fetched went from 6170 to 6326). Where is the problem? I expected it to fetch about 25000 URLs (5 loops x 5000).

--
View this message in context: http://www.nabble.com/nutch-fetches-already-fetched-urls-again-and-again-tp22226407p22227636.html
Sent from the Nutch - User mailing list archive at Nabble.com.
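
To make the numbers concrete, here is a small Python sketch of the arithmetic behind the question. It only compares the two statistics dumps above; it does not call Nutch. The names `before`, `after`, and `status_deltas` are made up for illustration, and treating depth x topn as the upper bound on fetched URLs is an assumption about how the recrawl loop works.

```python
# Status counts copied from the two statistics dumps above.
before = {
    "db_unfetched": 56968,
    "db_fetched": 6170,
    "db_gone": 818,
    "db_redir_temp": 299,
    "db_redir_perm": 270,
    "db_notmodified": 18,
}
after = {
    "db_unfetched": 57255,
    "db_fetched": 6326,
    "db_gone": 837,
    "db_redir_temp": 313,
    "db_redir_perm": 276,
    "db_notmodified": 18,
}

# Parameters used for the recrawl loop.
depth = 5
topn = 5000


def status_deltas(old, new):
    """Return the per-status change between two statistics dumps (hypothetical helper)."""
    return {status: new[status] - old[status] for status in old}


deltas = status_deltas(before, after)

# Assumption: each of the `depth` rounds selects at most `topn` URLs.
expected_max = depth * topn
newly_fetched = deltas["db_fetched"]

print("expected at most:", expected_max)
print("db_fetched grew by:", newly_fetched)
print("all deltas:", deltas)
```

The gap between `expected_max` and `newly_fetched` is what the question is about: db_fetched grew far less than five rounds of 5000 URLs would suggest.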
