Sorry, Nutch is certainly aware of page modification, and it does capture
lastModified. The real question is: can Nutch get the lastModified of a page
before fetching it, and use that to make fetching decisions (e.g., whether or
not to override the default interval)?
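
To make the question concrete, here is a rough sketch of the kind of decision
I have in mind. This is not Nutch code: should_refetch and its parameters are
hypothetical names, and it assumes the Last-Modified header value has already
been obtained cheaply somehow (for instance via an HTTP HEAD request) before
the full fetch. The 30-day constant just mirrors db.fetch.interval.default.

```python
from datetime import datetime, timedelta, timezone
from email.utils import parsedate_to_datetime
from typing import Optional

# Mirrors the 30-day default of db.fetch.interval.default (illustrative only).
DEFAULT_INTERVAL = timedelta(days=30)


def should_refetch(
    last_fetch: datetime,
    now: datetime,
    last_modified_header: Optional[str],
    interval: timedelta = DEFAULT_INTERVAL,
) -> bool:
    """Hypothetical decision rule, not part of Nutch.

    Refetch when the regular interval has elapsed, or earlier when the
    server-reported Last-Modified time is newer than our last fetch.
    """
    # The page is due anyway: the default interval has expired.
    if now - last_fetch >= interval:
        return True

    # No header available: fall back to the interval alone.
    if last_modified_header is None:
        return False

    try:
        last_modified = parsedate_to_datetime(last_modified_header)
    except (TypeError, ValueError):
        # Unparseable header: treat it as "no information".
        return False

    # Some servers send dates without a usable zone; assume UTC in that case.
    if last_modified.tzinfo is None:
        last_modified = last_modified.replace(tzinfo=timezone.utc)

    # Override the interval only if the page changed after we last fetched it.
    return last_modified > last_fetch
```

With this rule, a page fetched on June 1 whose server reports a modification
on June 10 would be refetched right away instead of waiting out the 30 days.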


On Fri, Jun 21, 2013 at 6:27 PM, Joe Zhang <[email protected]> wrote:

> If I don't change the default value of db.fetch.interval.default, which is
> 30 days, does it mean that the URL in the db won't be refetched before the
> due time even if it has been modified? In other words, is Nutch aware of
> page modification?
>
