Increase the number of threads when fetching.
Also, please see nutch-default.xml for the partitioning of URLs. If you know your
target domains, you may wish to adapt the policy.
Lewis
On Sunday, January 27, 2013, peterbarretto peterbarrett...@gmail.com
wrote:
I want to increase the number of urls fetched at
I tried increasing the number of threads to 50, but the speed is not affected.
I tried changing the partition.url.mode value to byDomain and
fetcher.queue.mode to byDomain, but it still does not help the speed.
It seems to get URLs from 2 domains now, and the other domains are not
getting crawled.
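
For reference, the settings described above could be written roughly as follows in conf/nutch-site.xml, whose values override nutch-default.xml. This is only a sketch: partition.url.mode, fetcher.queue.mode and the value 50 come from the message, while fetcher.threads.fetch is assumed here to be the property behind the global thread count, so check the names against the nutch-default.xml shipped with your version.

```xml
<?xml version="1.0"?>
<!-- conf/nutch-site.xml: values here override nutch-default.xml -->
<configuration>
  <!-- Assumption: this is the property for the total number of fetcher threads -->
  <property>
    <name>fetcher.threads.fetch</name>
    <value>50</value>
  </property>
  <!-- From the message: partition the URLs of a fetch list by domain -->
  <property>
    <name>partition.url.mode</name>
    <value>byDomain</value>
  </property>
  <!-- From the message: group the fetch queues by domain as well -->
  <property>
    <name>fetcher.queue.mode</name>
    <value>byDomain</value>
  </property>
</configuration>
```

With queues grouped by domain, all URLs of one domain share a single queue, so a higher total thread count may not help much if the fetch list covers only a few domains.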
Hey Peter,
I am guessing that you have just increased the global thread count. Have
you also increased fetcher.threads.per.host? This will improve the crawl
rate, as multiple threads can then fetch from the same site. Don't set it too
high, or else the system will get overloaded. The Nutch wiki has an