Thanks for finding this bug. Please open a bug report in JIRA; patches are always welcome. :-)

On 23.01.2006 at 15:00, [EMAIL PROTECTED] wrote:

Hi,

Protocol-httpclient sets the maximum number of total connections for the
underlying commons-httpclient to the value of the "fetcher.threads.fetch"
configuration parameter. However, if the -threads argument is used with the
fetcher, it does not change fetcher.threads.fetch. Whatever number of threads
is given to -threads, httpclient will still use the default number of total
connections (10). This will affect crawling performance. It seems to be a
bug. Any comments on this?

A possible solution is to add the line below to the setThreadCount function
of the Fetcher class:
 NutchConf.get().setInt("fetcher.threads.fetch", threadCount);
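To illustrate the idea, here is a minimal, self-contained sketch. It is not Nutch code: java.util.Properties stands in for NutchConf, and the class name ThreadCountSketch and the methods setThreadCountWithoutFix, setThreadCountWithFix and maxTotalConnections are hypothetical names made up for this example. The default of 10 is taken from the behaviour described above.

```java
import java.util.Properties;

// Stand-in sketch, not Nutch code: Properties plays the role of NutchConf,
// and maxTotalConnections() plays the role of the HTTP plugin's setup code.
public class ThreadCountSketch {
    static final String KEY = "fetcher.threads.fetch";
    static final int DEFAULT_THREADS = 10; // default reported in this thread

    private final Properties conf;
    private int threadCount = DEFAULT_THREADS;

    ThreadCountSketch(Properties conf) {
        this.conf = conf;
    }

    // Reported behaviour: only the fetcher's own field changes.
    void setThreadCountWithoutFix(int n) {
        this.threadCount = n;
    }

    // Suggested fix: also write the value back to the shared configuration.
    void setThreadCountWithFix(int n) {
        this.threadCount = n;
        conf.setProperty(KEY, Integer.toString(n));
    }

    int getThreadCount() {
        return threadCount;
    }

    // What the HTTP layer would read when sizing its connection pool.
    static int maxTotalConnections(Properties conf) {
        String fallback = Integer.toString(DEFAULT_THREADS);
        return Integer.parseInt(conf.getProperty(KEY, fallback));
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        ThreadCountSketch fetcher = new ThreadCountSketch(conf);

        fetcher.setThreadCountWithoutFix(50);
        System.out.println(maxTotalConnections(conf)); // prints 10

        fetcher.setThreadCountWithFix(50);
        System.out.println(maxTotalConnections(conf)); // prints 50
    }
}
```

Without the write-back, the fetcher runs 50 threads while the connection limit stays at the default; with it, both values agree.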

Also, the fetcher seems to be using a lot of memory, maybe due to a memory
leak. It starts at 10-15%; after several hours, the Linux top command reports
that it is using 50-70% of total memory. Is anyone else experiencing this
behaviour?

Thanks,
-orkunt.


---------------------------------------------------------------
company:        http://www.media-style.com
forum:        http://www.text-mining.org
blog:            http://www.find23.net

