I need to know a few things about how to manage a Nutch crawl.
1. I have done a full crawl in which all possible intranet sites were discovered and indexed. I don't want to lose this index. Instead, I want to update the same index by recrawling these sites, so that if any page has changed, the content for that URL in the index is updated. Is this possible?
2. During the recrawl of the same set of websites, if the crawler finds links to new pages (on the same website or on other websites) that are not currently in the index, those pages should also be fetched and added to the index. Is this possible?
