Then those URLs are due for fetching because of their fetch time and fetch interval. It depends on which fetch scheduler you are using. If it's the adaptive schedule, it's likely to run out of control if all pages always return a new signature and are never not_modified.
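
To illustrate the effect, here is a minimal sketch of how an adaptive schedule adjusts the interval. This is not the actual AdaptiveFetchSchedule source; the class name, constants and default rates below are only illustrative:

// Simplified sketch of adaptive fetch interval adjustment.
// INC_RATE, DEC_RATE, MIN_INTERVAL and MAX_INTERVAL are illustrative values,
// not the real Nutch defaults.
public class AdaptiveScheduleSketch {

    static final float INC_RATE = 0.4f;                 // back off when page is unmodified
    static final float DEC_RATE = 0.2f;                 // fetch sooner when page changed
    static final long MIN_INTERVAL = 60L;               // seconds
    static final long MAX_INTERVAL = 365L * 24 * 3600;  // seconds

    // Returns the new fetch interval given the old one and whether the
    // page's signature changed since the last fetch.
    static long adjustInterval(long interval, boolean modified) {
        if (modified) {
            // Content changed: shrink the interval so the page is due again sooner.
            interval -= (long) (interval * DEC_RATE);
        } else {
            // Content unchanged (not_modified): grow the interval.
            interval += (long) (interval * INC_RATE);
        }
        return Math.max(MIN_INTERVAL, Math.min(MAX_INTERVAL, interval));
    }

    public static void main(String[] args) {
        long interval = 30L * 24 * 3600; // start at 30 days
        // A page whose signature changes on every fetch has its interval
        // cut each cycle, so it keeps coming up as due.
        for (int fetch = 1; fetch <= 10; fetch++) {
            interval = adjustInterval(interval, true);
            System.out.println("fetch " + fetch + ": interval = " + interval + "s");
        }
    }
}

So a page whose signature changes on every fetch keeps having its interval reduced, is due again almost immediately, and if it also scores high it keeps getting generated ahead of new URLs.
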
On Friday 23 December 2011 15:33:55 Bai Shen wrote:
> So it's normal to get that many retries? I thought the db.fetch.retry.max
> value was the limit on that?
>
> My main concern is that it seems to keep attempting the same set of urls
> because those are the highest scoring. So it's not fetching that much new
> content.
>
> On Fri, Dec 23, 2011 at 5:06 AM, Markus Jelsma
> <[email protected]> wrote:
> > This is normal and done in your fetch scheduler. Check the fetch
> > schedule's code to see exactly when and why it is incremented.
> >
> > On Thursday 22 December 2011 19:39:02 Bai Shen wrote:
> > > I'm using the default db.fetch.retry.max value of 3, but I'm seeing
> > > retry counts as high as 14 in the crawldb stats output. Any ideas why
> > > this is and how to change it?
> >
> > --
> > Markus Jelsma - CTO - Openindex

--
Markus Jelsma - CTO - Openindex

