You should use the adaptive fetch schedule. See http://pascaldimassimo.com/2010/06/11/how-to-re-crawl-with-nutch/ for details.
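
As a rough sketch, enabling it usually comes down to overriding a few properties in conf/nutch-site.xml. The property names below are the ones I know from nutch-default.xml; the values are only illustrative, so please check them against the nutch-default.xml shipped with your version before relying on them.

```xml
<!-- conf/nutch-site.xml: illustrative values only, not tuned recommendations -->
<configuration>
  <!-- Replace the default fixed-interval schedule with the adaptive one -->
  <property>
    <name>db.fetch.schedule.class</name>
    <value>org.apache.nutch.crawl.AdaptiveFetchSchedule</value>
  </property>
  <!-- Grow the re-fetch interval by this factor when a page is found unchanged -->
  <property>
    <name>db.fetch.schedule.adaptive.inc_rate</name>
    <value>0.4</value>
  </property>
  <!-- Shrink the re-fetch interval by this factor when a page has changed -->
  <property>
    <name>db.fetch.schedule.adaptive.dec_rate</name>
    <value>0.2</value>
  </property>
  <!-- Lower bound on the interval, in seconds -->
  <property>
    <name>db.fetch.schedule.adaptive.min_interval</name>
    <value>86400</value>
  </property>
  <!-- Upper bound on the interval, in seconds (here 365 days) -->
  <property>
    <name>db.fetch.schedule.adaptive.max_interval</name>
    <value>31536000</value>
  </property>
</configuration>
```

With a setup along these lines, documents that keep coming back unchanged (such as your PDF files) should be re-fetched less and less often, up to the maximum interval, while pages that do change are revisited sooner. Note that this reduces how often unchanged documents are fetched; it does not stop them from being fetched altogether.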

On 1 June 2011 07:18, <[email protected]> wrote:

> Hello,
>
> I use nutch-1.2 to index about 3000 sites. One of them has about 1500 pdf
> files which do not change over time.
> I wondered if there is a way of configuring nutch not to fetch unchanged
> documents again and again, but keep the old index for them.
>
> Thanks.
> Alex.

-- 
Open Source Solutions for Text Engineering
http://digitalpebble.blogspot.com/
http://www.digitalpebble.com

