Sami Siren wrote:

I set logging to DEBUG level and checked the timing during operations. A MapReduce job is run after every page, and it takes 3-4 seconds until the next URL is fetched.
I have a local site, and fetching 100 pages takes about 6 minutes.

You are fetching a single site, yes? Then you can get more performance by tweaking the
fetcher configuration:

<property>
 <name>fetcher.server.delay</name>
 <value></value>
 <description>The number of seconds the fetcher will delay between
  successive requests to the same server.</description>
</property>

<property>
 <name>fetcher.threads.per.host</name>
 <value></value>
 <description>This number is the maximum number of threads that
   should be allowed to access a host at one time.</description>
</property>
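
As a rough sketch, overrides like these would normally go into conf/nutch-site.xml rather than the default configuration file. The two values below are illustrative assumptions for a local site that you control yourself; they are not taken from this thread, so pick values your own server can handle.

```xml
<?xml version="1.0"?>
<!-- Sketch of a conf/nutch-site.xml override; both values are assumed examples. -->
<configuration>

<property>
 <name>fetcher.server.delay</name>
 <!-- Assumed value: no pause between successive requests to the same server. -->
 <value>0</value>
</property>

<property>
 <name>fetcher.threads.per.host</name>
 <!-- Assumed value: allow several threads to access the same host at one time. -->
 <value>4</value>
</property>

</configuration>
```

With a single site, all URLs share one host, so these two settings are what limit throughput; on servers you do not own, a zero delay and many threads per host would be impolite.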

Hi,

I've managed to test Nutch speed on several machines with different operating systems as well.
It looks like fetcher.threads.per.host makes the fetcher run faster.

What I still don't understand is this.

When the fetcher threads setting was at its default value, the fetcher was running a MapReduce job after every URL.
But now a job is run for about 400 URLs, or maybe more.

--
Uros
--
Sami Siren
