Hi, guys,

my goal is to run my crawls at 100 fetches per second while, of course,
observing polite crawling. When the URLs are all on different domains, what
theoretically would stop some software from downloading from 100 domains at
once, achieving the desired speed?
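To illustrate the idea (a minimal sketch, not a real crawler): politeness only serializes requests *within* a domain, so with one worker per domain the aggregate rate scales with the number of domains. The domain names, the per-domain delay, and the `fetch` stub below are all hypothetical.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

PER_DOMAIN_DELAY = 0.05  # assumed politeness gap between hits on one domain

def fetch(url):
    # Placeholder for a real HTTP fetch; here we just return the URL.
    return url

def crawl_domain(domain, paths, results, lock):
    # One worker per domain: sequential, delay-separated requests,
    # so no single domain ever sees concurrent hits from us.
    for p in paths:
        page = fetch("http://%s/%s" % (domain, p))
        with lock:
            results.append(page)
        time.sleep(PER_DOMAIN_DELAY)

domains = ["site%d.example" % i for i in range(20)]  # hypothetical domains
results, lock = [], threading.Lock()

start = time.time()
with ThreadPoolExecutor(max_workers=len(domains)) as pool:
    for d in domains:
        pool.submit(crawl_domain, d, ["a", "b", "c"], results, lock)
elapsed = time.time() - start

# 20 domains x 3 pages each run in parallel, so wall time stays near
# 3 * 0.05 s rather than the 60 * 0.05 s a single-domain crawl would take.
print(len(results), elapsed)
```

The same argument says 100 domains at a 1 s politeness delay would still yield roughly 100 fetches/second in aggregate, provided the frontier keeps enough distinct domains in flight.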

But whatever I do, I can't make Nutch crawl at that speed. Even if it
starts at a few dozen URLs/second, it slows down towards the end (as
discussed by many, including Krugler).

Should I write something of my own, or are there fast crawlers out there?

Thanks!

Mark
