Hi Joe,
In 1.x, Markus and Julien (IIRC) committed a really nice patch a while back
that allows you to achieve what I think you are after.
Please look at this thread:
http://www.mail-archive.com/[email protected]/msg08738.html
You will find plenty of discussion in the user list archive about this kind
of fine-grained control.
Thanks, have a good weekend.

On Friday, June 21, 2013, Joe Zhang <[email protected]> wrote:
> Sorry, Nutch is certainly aware of page modification, and it does capture
> lastModified. The real question is, can Nutch get lastModified of a page
> before fetching, and use it to make fetching decisions (e.g., whether or
> not to override the default interval)?
>
>
> On Fri, Jun 21, 2013 at 6:27 PM, Joe Zhang <[email protected]> wrote:
>
>> If I don't change the default value of db.fetch.interval.default, which is
>> 30 days, does it mean that the URL in the db won't be refetched before the
>> due time even if it has been modified? In other words, is Nutch aware of
>> page modification?
>>
>

-- 
*Lewis*
