|
I have been trying Nutch for a while, and I seem to be running into
problems with the indexing speed in the later versions. When I crawl
about 300 sites (to depth 4, for instance), the initial speed is about
2 pages per second (my connection is about 600 kbps), but once new
segments are generated, the speed drops to as low as 0.4 pages per
second. I use the default Nutch configuration, and the same thing
happens when I try the whole-web indexing method (using the same 300
sites). I don't have records, but I recall that in earlier versions of
Nutch the indexing speed did not decrease at all, or at least not to
that extent.
Am I missing something?
Thanks
|
