I went through the checklist and made some changes, including increasing
the number of fetcher threads from the default of 10 to 30, but I still
see Nutch eating up all the resources; CPU usage is as high as 100%.
-Bharat
On Tuesday 21 February 2012 04:45 PM, Julien Nioche wrote:
See
Try decreasing the number of fetcher threads instead...
On Wed, Feb 22, 2012 at 2:33 PM, Bharat Goyal bharat.go...@shiksha.com wrote:
I went through the checklist and made some changes, including increasing
the number of fetcher threads from the default of 10 to 30, but I still
see Nutch eating up all the
Hi,
I have a list of around 1000 seed URLs, which I crawl to depth 2 or 3.
This is done on a local machine (with no other large resource-consuming
processes running) with the following configuration:
Dual Core (2.4 GHz),
4 GB RAM
It takes around 14-15 hours to crawl this seedlist, which generates
See http://wiki.apache.org/nutch/OptimizingCrawls for a checklist
On 21 February 2012 10:47, Bharat Goyal bharat.go...@shiksha.com wrote:
The number of fetcher threads is equal to the default value (10). What is
the optimum value for the number of threads? Also, fetching and parsing
are not separate.
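
For reference, the two settings mentioned here are normally overridden in
conf/nutch-site.xml. The fragment below is only a minimal sketch, assuming a
Nutch 1.x install; the values are illustrative and are not a recommendation
made anywhere in this thread.

```xml
<?xml version="1.0"?>
<configuration>
  <!-- Total number of fetcher threads; 10 is the default mentioned above. -->
  <property>
    <name>fetcher.threads.fetch</name>
    <value>10</value>
  </property>
  <!-- Assumption: with false, the fetcher does not parse content,
       so parsing runs as a separate step after fetching. -->
  <property>
    <name>fetcher.parse</name>
    <value>false</value>
  </property>
</configuration>
```

Whether a higher or lower thread count helps on a dual-core machine depends
on the crawl, so it is worth changing one value at a time and comparing runs.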