You should use the adaptive fetch schedule. See
http://pascaldimassimo.com/2010/06/11/how-to-re-crawl-with-nutch/ for
details.
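
For reference, a minimal sketch of what this could look like in
conf/nutch-site.xml. The property names are the ones I would expect to find
in nutch-default.xml for Nutch 1.x, so please check them against your
version. The values are only illustrative assumptions, not tuned
recommendations.

```xml
<!-- Switch from the default schedule to the adaptive one -->
<property>
  <name>db.fetch.schedule.class</name>
  <value>org.apache.nutch.crawl.AdaptiveFetchSchedule</value>
</property>

<!-- Factor by which the interval grows when a page is found unchanged
     (illustrative value) -->
<property>
  <name>db.fetch.schedule.adaptive.inc_rate</name>
  <value>0.4</value>
</property>

<!-- Factor by which the interval shrinks when a page has changed
     (illustrative value) -->
<property>
  <name>db.fetch.schedule.adaptive.dec_rate</name>
  <value>0.2</value>
</property>

<!-- Lower and upper bounds for the re-fetch interval, in seconds
     (illustrative values: 1 day and 1 year) -->
<property>
  <name>db.fetch.schedule.adaptive.min_interval</name>
  <value>86400.0</value>
</property>
<property>
  <name>db.fetch.schedule.adaptive.max_interval</name>
  <value>31536000.0</value>
</property>
```

With this in place, documents that are detected as unchanged should be
re-fetched less and less often, up to the maximum interval. Note that
db.fetch.interval.max may still force a re-fetch once it expires.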

On 1 June 2011 07:18, <[email protected]> wrote:

> Hello,
>
> I use nutch-1.2 to index about 3000 sites. One of them has about 1500 pdf
> files which do not change over time.
> I wondered if there is a way of configuring nutch not to fetch unchanged
> documents again and again, but keep the old index for them.
>
>
> Thanks.
> Alex.
>



-- 
Open Source Solutions for Text Engineering

http://digitalpebble.blogspot.com/
http://www.digitalpebble.com
