Hi all,
I just posted some performance figures from a test crawl I did using
an alternative queue-based fetcher (Bixo) at:
http://ken-blog.krugler.org/2009/05/19/performance-problems-with-verticalfocused-web-crawling/
Based on this data and my previous experience using Nutch for vertical
crawls, I keep wondering whether some of the performance difference
between the original fetcher and Fetcher2 is due to bugs in the old
fetcher (basically impolite fetching).
Has anybody done any testing with the old fetcher to verify that it's
acting politely, especially near the end of a crawl?
-- Ken
--
Ken Krugler
+1 530-210-6378