Hi all,

I am working with nutch-0.8.1 on CentOS.

I was previously working with Google, which let me run a continuous crawl. That way I no longer had to run a complete crawl every time: a single crawl updated my index whenever a site had changed. It only did a complete crawl at the beginning, and after that it was continuous.

My question is: is it possible to do this kind of continuous crawl with Nutch? I am trying to index 7 sites, but it takes more than 3 days.

[EMAIL PROTECTED] nutch-0.8]# ./bin/nutch readdb crawl2/crawldb -stats
CrawlDb statistics start: crawl2/crawldb
Statistics for CrawlDb: crawl2/crawldb
TOTAL urls: 286272
retry 0: 284788
retry 1: 856
retry 2: 628
min score: 0.0
avg score: 5.5150344E-5
max score: 1.396
status 1 (DB_unfetched): 23
status 2 (DB_fetched): 284463
status 3 (DB_gone): 1786
CrawlDb statistics: done

I am also trying to set up Nutch with Hadoop to reduce the crawl time. Any ideas that could help me?

Thanks in advance.
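
To make the question concrete, here is roughly what I imagine one round of such a continuous recrawl would look like. It is only a sketch and not a tested setup: it assumes the generate, fetch and updatedb commands of bin/nutch and the -adddays option of generate, and the names NUTCH, CRAWL, ADDDAYS, DRY_RUN, run and recrawl_round are placeholders I made up. Re-indexing after each round is left out.

```shell
#!/bin/sh
# Sketch of one incremental recrawl round for Nutch 0.8.x (untested assumption).
# All variable and function names below are hypothetical placeholders.

NUTCH=${NUTCH:-./bin/nutch}   # path to the nutch launcher
CRAWL=${CRAWL:-crawl2}        # crawl directory holding crawldb/ and segments/
ADDDAYS=${ADDDAYS:-0}         # days added to "now" when selecting pages due for refetch
DRY_RUN=${DRY_RUN:-1}         # 1 = only print the commands, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$@"
    else
        "$@"
    fi
}

# One round: select the pages that are due, fetch them, fold the results back into the crawldb.
recrawl_round() {
    run "$NUTCH" generate "$CRAWL/crawldb" "$CRAWL/segments" -adddays "$ADDDAYS" || return 1

    if [ "$DRY_RUN" = "1" ]; then
        # No segment was really created, so use a placeholder name.
        segment="$CRAWL/segments/NEW_SEGMENT"
    else
        # Segment directories are named by timestamp, so the last one is the newest.
        segment=$(ls -d "$CRAWL"/segments/* | tail -1)
    fi

    run "$NUTCH" fetch "$segment" || return 1
    run "$NUTCH" updatedb "$CRAWL/crawldb" "$segment"
}
```

Calling recrawl_round with DRY_RUN=1 only prints the three commands, so the sequence can be checked first; with DRY_RUN=0 it could be run from cron, for example once a night. Is something like this the right approach, or is there a better way?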
