Hi,

Can someone please explain how the following scenario works?
I need to crawl a site with 50K URLs. The site is dynamic and is updated frequently. Assuming it takes two days to crawl the site completely, is there a configuration (a fetch schedule or something else) so that once one crawl cycle finishes, the next cycle starts automatically two days later to pick up the new URLs?

If this feature is not available, should we control the repeated crawling ourselves through some sort of scripting? We will actually have more than 50 sites to crawl, each one separately. If we have to manage the re-crawling of each site ourselves, do we need 50 separate scripts to handle them?

Has anyone faced this situation?

Thanks,
Senthil
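P.S. To make the question concrete, here is the kind of conf/nutch-site.xml configuration I have been experimenting with. The interval values are only guesses for our update rate, and my understanding so far is that AdaptiveFetchSchedule only decides *when* a URL is due for refetching; the crawl cycle itself still has to be launched externally:

  <!-- inside <configuration> in conf/nutch-site.xml -->
  <property>
    <name>db.fetch.schedule.class</name>
    <value>org.apache.nutch.crawl.AdaptiveFetchSchedule</value>
  </property>
  <property>
    <name>db.fetch.interval.default</name>
    <value>172800</value> <!-- 2 days, in seconds -->
  </property>
  <property>
    <name>db.fetch.schedule.adaptive.min_interval</name>
    <value>86400</value> <!-- refetch fast-changing pages as often as daily -->
  </property>
  <property>
    <name>db.fetch.schedule.adaptive.max_interval</name>
    <value>604800</value> <!-- back off to weekly for pages that never change -->
  </property>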

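And this is the sort of single parameterized script (recrawl.sh here, with made-up paths, depth, and topN values) that I imagine running from cron instead of maintaining 50 separate scripts, one cron entry per site:

  #!/bin/sh
  # recrawl.sh SITE -- re-run the whole crawl cycle for one site.
  # Example cron entry, every 2 days at 02:00:
  #   0 2 */2 * * /opt/nutch/recrawl.sh site01
  SITE="$1"
  [ -n "$SITE" ] || { echo "usage: recrawl.sh SITE" >&2; exit 1; }

  NUTCH_HOME=/opt/nutch
  SEED_DIR=/data/seeds/$SITE     # one seed-URL directory per site
  CRAWL_DIR=/data/crawls/$SITE   # crawldb/segments live here across runs

  cd "$NUTCH_HOME" || exit 1
  # Nutch 1.x one-shot command: inject, then loop generate/fetch/parse/
  # updatedb. Re-run against the same crawl dir, generate should select
  # only the URLs whose fetch interval has elapsed, plus newly found ones.
  bin/nutch crawl "$SEED_DIR" -dir "$CRAWL_DIR" -depth 5 -topN 50000

If that is roughly right, the same script would cover all 50 sites and only the cron table grows.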
