https://issues.apache.org/jira/browse/NUTCH-578
https://issues.apache.org/jira/browse/NUTCH-1245

Is your issue similar to these?

On Tuesday 28 February 2012 14:09:25 Mathijs Homminga wrote:
> Hi,
> 
> Does anyone know why the AbstractFetchSchedule.forceFetch method sets the
> page.status to STATUS_UNFETCHED?
> 
> The DbUpdateReducer calls this method when the page.fetchInterval exceeds
> the (current) db.fetch.interval.max. As I understand it, we call this
> method to keep all fetchIntervals in the webtable within the current
> maximum, but why reset the page status?
> 
> I bumped into this because my db.fetch.interval.default >
> db.fetch.interval.max ;)) After a couple of successful crawl cycles, all
> of my webpages still were STATUS_UNFETCHED.
> 
> Cheers,
> Mathijs
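
For reference, the behaviour described above can be reproduced with a small toy model. This is only a sketch under stated assumptions, not the actual Nutch code: the class ForceFetchSketch, the Page type and the simulate helper are hypothetical names, and two details are assumed rather than taken from the source: that a successful fetch reschedules the page with the default interval, and that forceFetch clamps the interval to the maximum. The interval values are illustrative.

```java
// Toy model of the update cycle described in this thread (NOT the Nutch source).
class ForceFetchSketch {
    enum Status { UNFETCHED, FETCHED }

    static final int DAY = 24 * 60 * 60; // seconds per day

    /** Hypothetical stand-in for a webtable row. */
    static final class Page {
        Status status = Status.UNFETCHED;
        int fetchInterval; // seconds

        Page(int fetchInterval) {
            this.fetchInterval = fetchInterval;
        }
    }

    /** Mirrors the reported behaviour: the status is reset to UNFETCHED. */
    static void forceFetch(Page page, int maxInterval) {
        // Assumption: the interval is clamped to the configured maximum.
        page.fetchInterval = Math.min(page.fetchInterval, maxInterval);
        page.status = Status.UNFETCHED;
    }

    /** Runs the given number of fetch/update cycles and returns the final status. */
    static Status simulate(int defaultInterval, int maxInterval, int cycles) {
        Page page = new Page(defaultInterval);
        for (int i = 0; i < cycles; i++) {
            // Fetch step. Assumption: a successful fetch marks the page as
            // fetched and reschedules it with the default interval.
            page.status = Status.FETCHED;
            page.fetchInterval = defaultInterval;

            // Update step, as described for DbUpdateReducer in the thread:
            // forceFetch is called when the interval exceeds the maximum.
            if (page.fetchInterval > maxInterval) {
                forceFetch(page, maxInterval);
            }
        }
        return page.status;
    }

    public static void main(String[] args) {
        // default (30 days) below max (90 days): the page stays fetched
        System.out.println(simulate(30 * DAY, 90 * DAY, 3));
        // default (120 days) above max (90 days): reset on every cycle
        System.out.println(simulate(120 * DAY, 90 * DAY, 3));
    }
}
```

Under these assumptions the first call ends with the page FETCHED and the second with UNFETCHED, however many cycles are run, which matches the symptom reported when db.fetch.interval.default is larger than db.fetch.interval.max.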

-- 
Markus Jelsma - CTO - Openindex
