https://issues.apache.org/jira/browse/NUTCH-578
https://issues.apache.org/jira/browse/NUTCH-1245
Is your issue similar to these?

On Tuesday 28 February 2012 14:09:25 Mathijs Homminga wrote:
> Hi,
>
> Does anyone know why the AbstractFetchSchedule.forceFetch method sets the
> page.status to STATUS_UNFETCHED?
>
> The DbUpdateReducer calls this method when the page.fetchInterval exceeds
> the (current) db.fetch.interval.max. As I understand it, we call this
> method to keep all fetchIntervals in the webtable within the current
> maximum, but why reset the page status?
>
> I bumped into this because my db.fetch.interval.default >
> db.fetch.interval.max ;)) After a couple of successful crawl cycles, all
> of my webpages still were STATUS_UNFETCHED.
>
> Cheers,
> Mathijs

--
Markus Jelsma - CTO - Openindex
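
For illustration, the behaviour described in the quoted message can be modelled roughly as below. This is a minimal sketch and not the Nutch source: the class ForceFetchSketch, the Page type, the numeric status values, the updateAfterFetch method and the clamping of the interval are all assumptions made for this example. Only the names forceFetch, STATUS_UNFETCHED, fetchInterval and db.fetch.interval.max come from the thread.

```java
public class ForceFetchSketch {
    // Hypothetical status codes; the real Nutch constants may differ.
    static final int STATUS_UNFETCHED = 1;
    static final int STATUS_FETCHED = 2;

    // Hypothetical stand-in for a row in the webtable.
    static class Page {
        int status = STATUS_UNFETCHED;
        int fetchInterval; // seconds

        Page(int fetchInterval) {
            this.fetchInterval = fetchInterval;
        }
    }

    // Models the behaviour reported in the thread: the interval is brought
    // back within the maximum (assumed here to be a simple clamp) and the
    // status is reset to STATUS_UNFETCHED.
    static void forceFetch(Page page, int maxInterval) {
        page.fetchInterval = Math.min(page.fetchInterval, maxInterval);
        page.status = STATUS_UNFETCHED;
    }

    // Models the update step as described: after a successful fetch the page
    // is marked fetched, but forceFetch is called when its interval exceeds
    // the configured maximum, which overwrites that status again.
    static void updateAfterFetch(Page page, int maxInterval) {
        page.status = STATUS_FETCHED;
        if (page.fetchInterval > maxInterval) {
            forceFetch(page, maxInterval);
        }
    }

    public static void main(String[] args) {
        int maxInterval = 30 * 24 * 3600;      // stands in for db.fetch.interval.max
        Page page = new Page(90 * 24 * 3600);  // interval larger than the maximum
        updateAfterFetch(page, maxInterval);
        System.out.println("unfetched after update: "
                + (page.status == STATUS_UNFETCHED));
    }
}
```

In this model a page whose interval is within the maximum keeps its fetched status, while a page whose interval is above it ends the update step as unfetched even though the fetch succeeded.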

