Hi Mark,

I've recently contributed two patches on JIRA (NUTCH-769 / NUTCH-770) which
should improve crawling speed. They should help with the fetch rate slowing
down.
There is also https://issues.apache.org/jira/browse/NUTCH-753 which should
help to a lesser extent.

Julien

-- 
DigitalPebble Ltd
http://www.digitalpebble.com

2009/11/24 Mark Kerzner <markkerz...@gmail.com>

> Hi, guys,
>
> my goal is to do my crawls at 100 fetches per second, observing, of course,
> polite crawling. But, when URLs are all different domains, what
> theoretically would stop some software from downloading from 100 domains at
> once, achieving the desired speed?
>
> But, whatever I do, I can't make Nutch crawl at that speed. Even if it
> starts at a few dozen URLs/second, it slows down at the end (as discussed
> by many and by Krugler).
>
> Should I write something of my own, or are there fast crawlers?
>
> Thanks!
>
> Mark
>
