You can filter out some of that unnecessary "tail" using a URLFilter; for 
instance, some sites have long forums you don't need, or shopping-cart / 
proceed-to-checkout pages which they forgot to restrict via robots.txt...

Check regex-urlfilter.txt.template in /conf
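For example, a few exclude rules in regex-urlfilter.txt could look like the
following (the path patterns are just illustrative assumptions, not taken
from any particular site):

    # skip forum and checkout pages (example patterns -- adjust to your sites)
    -^http://.*/forum/
    -^http://.*/cart
    -^http://.*/checkout
    # accept anything else
    +.

Rules are applied top to bottom; the first matching +/- prefix decides
whether the URL is kept.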


Another parameter that helps equalize the number of per-site URLs is 
db.max.outlinks.per.page (default 100): some sites may have 10 links per 
page, others 1000...
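If the default doesn't suit you, you can override it in conf/nutch-site.xml
with something like this (the value 50 is just an example):

    <property>
      <name>db.max.outlinks.per.page</name>
      <value>50</value>
      <description>Max outlinks taken from a single page.</description>
    </property>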


-Fuad
http://www.linkedin.com/in/liferay
http://www.tokenizer.org



-----Original Message-----
From: MilleBii [mailto:[email protected]] 
Sent: August-25-09 5:48 PM
To: [email protected]
Subject: Limiting number of URL from the same site in a fetch cycle

I'm wondering if there is a setting by which you can limit the number of
URLs per site in a fetch list, rather than for the site as a whole.
That way I could avoid long tails in a fetch list coming all from the same
site, which takes damn long (5s per URL); I'd rather fetch them in the next
cycle.

-- 
-MilleBii-

