Sorry, Nutch is certainly aware of page modification, and it does capture lastModified. The real question is: can Nutch get the lastModified of a page before fetching, and use it to make fetching decisions (e.g., whether or not to override the default interval)?
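
To make the question concrete, here is a rough sketch of the decision I have in mind. It is plain Java, not Nutch code: the class RefetchDecision and the method shouldRefetch are hypothetical names, and I am assuming the page's lastModified could somehow be learned cheaply before the full fetch (for example from an HTTP HEAD request, or a conditional GET with If-Modified-Since).

```java
import java.time.Duration;
import java.time.Instant;

// Hypothetical helper for illustration only; not part of Nutch.
class RefetchDecision {

    /**
     * Decide whether a URL should be refetched now.
     *
     * @param lastFetch    when the page was last fetched
     * @param lastModified modification time reported by the server before
     *                     fetching, or null if unknown (assumption: available)
     * @param now          current time
     * @param interval     fetch interval, e.g. 30 days as in db.fetch.interval.default
     */
    static boolean shouldRefetch(Instant lastFetch, Instant lastModified,
                                 Instant now, Duration interval) {
        // The interval has elapsed: the page is due regardless of modification.
        if (!now.isBefore(lastFetch.plus(interval))) {
            return true;
        }
        // Not yet due: override the interval only if the server reports
        // a modification later than our last fetch.
        return lastModified != null && lastModified.isAfter(lastFetch);
    }

    public static void main(String[] args) {
        Instant lastFetch = Instant.parse("2013-06-01T00:00:00Z");
        Instant modified = Instant.parse("2013-06-10T00:00:00Z");
        Instant now = Instant.parse("2013-06-21T00:00:00Z");
        System.out.println(shouldRefetch(lastFetch, modified, now, Duration.ofDays(30)));
    }
}
```

In other words, the page would be refetched early only when the server says it changed after the last fetch; otherwise the default interval still applies.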

On Fri, Jun 21, 2013 at 6:27 PM, Joe Zhang <[email protected]> wrote:
> If I don't change the default value of db.fetch.interval.default, which is
> 30 days, does it mean that the URL in the db won't be refetched before the
> due time even if it has been modified? In other words, is Nutch aware of
> page modification?

