Hi, thanks for your reply.

Yes, I was fetching from Wikipedia only; I did this just to test the
slowdown effect. But the rate is not that high, I think: about 4 pages/s,
and it still gets slower and slower, indefinitely. So is the fetcher
supposed to run slower than 1 page/s (per site)?
I watched my bandwidth: it used less than 20k/s, far below what my provider
is comfortable with.
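
For what it's worth, here is a rough back-of-the-envelope sketch of the
numbers above. It assumes a very simple model that I have not checked
against the Nutch source: each thread fetches one page, then waits the
server delay before its next request to the same host. The function names
and the 1.0 s fetch time are hypothetical, used only for illustration.

```python
def politeness_ceiling(threads_per_host, server_delay_s, fetch_time_s=0.0):
    """Upper bound on pages/s for one host under the simple model:
    every thread repeats (fetch one page, then wait server_delay_s)."""
    if threads_per_host < 1:
        raise ValueError("need at least one thread per host")
    cycle_s = fetch_time_s + server_delay_s
    if cycle_s <= 0:
        raise ValueError("fetch time plus delay must be positive")
    return threads_per_host / cycle_s


def avg_page_size(bandwidth_per_s, pages_per_s):
    """Average transfer per page, in the same unit as the bandwidth figure."""
    if pages_per_s <= 0:
        raise ValueError("pages_per_s must be positive")
    return bandwidth_per_s / pages_per_s


if __name__ == "__main__":
    # One thread, 0.5 s delay, instantaneous fetch (best case).
    print(politeness_ceiling(1, 0.5))
    # 30 threads, 0.5 s delay, an assumed 1.0 s per fetch (made-up value).
    print(politeness_ceiling(30, 0.5, fetch_time_s=1.0))
    # About 20k/s observed at about 4 pages/s.
    print(avg_page_size(20, 4))
```

Under this model the delay alone would not cap a single thread below
2 pages/s, and 20k/s at 4 pages/s works out to roughly 5k per page, so
neither figure by itself seems to explain a rate that keeps dropping.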



Dennis Kubes-2 wrote:
> 
> If this is stalling on only a few fetching tasks, check the logs; more 
> than likely it is fetching many pages from a single site (e.g. amazon, 
> wikipedia, cnn) and the politeness settings (which you want to keep) are 
> slowing it down.
> 
> If it is stalling on many tasks but on a single machine, check the 
> hardware for that machine.  We have seen hard disk speeds decrease 
> dramatically right before the disks die.  On Linux, run something like 
> hdparm -tT /dev/hda, where hda is the device to check.  Average speeds 
> for SATA should be in the 75 MB/s range for disk reads and the 7000+ 
> MB/s range for cached reads.
> 
> Another possibility: you may be maxing out your bandwidth and your 
> provider is throttling you.
> 
> Dennis Kubes
> 
> purpureleaf wrote:
>> Hi, I have worked with Nutch for some time. One thing I am always curious
>> about: when crawling, the fetcher's speed gets slower and slower, no
>> matter what configuration I use.
>> My last test gave this (just one site, to keep the problem simple):
>> 
>> OS : winxp
>> java : 1.6.0.2
>> nutch: 0.9
>> cpu : AMD 1800
>> mem : 1G
>> network : 3m adsl
>> 
>> site : wikipedia.org
>> threads per site :30
>> server.delay : 0.5
>> 
>> It starts at about 6 pages/s, but drops to 4 within a few minutes, then
>> gets slower and slower. I ran it for 8 hours; only 2 pages/s was left,
>> and it was still slowing down.
>> But if I stop it and start another run, it returns to full speed (then
>> slows down again). I am OK with 2 pages/s for one site, but I do hope it
>> will keep that speed.
>> 
>> I found that some other people on this list have the same problem, but I
>> can't find an answer.
>> Is Nutch designed to work this way?
>> 
>> Thanks!
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Fetcher-get-slower-and-slower-in-one-run-of-crawling-tf4241580.html#a12073371
Sent from the Nutch - User mailing list archive at Nabble.com.
