Hi, thanks for your reply.

Yes, I was fetching from Wikipedia only; I did this just to test the slowdown effect. But I don't think I was fetching too aggressively: about 4 pages/s, and it still gets slower and slower, indefinitely. So is the fetcher supposed to be slower than 1 page/s (per site)? I watched my bandwidth: it used less than 20k/s, far below what my provider is comfortable with.
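To put rough numbers on this, here is a minimal Python sketch using the figures from this thread. It rests on several assumptions: that "3m adsl" means a nominal 3 Mbit/s downstream link, that "20k/s" means 20 kB/s, and that each fetcher thread fetches one page and then waits the server delay before the next request. That last point is a simplified model, not necessarily how Nutch schedules requests, and `avg_response_s` is a hypothetical value that was not measured here.

```python
def link_capacity_kb_per_s(link_mbit: float) -> float:
    """Convert a nominal link rate in Mbit/s to kB/s (1000 kbit per Mbit, 8 bits per byte)."""
    return link_mbit * 1000 / 8


def utilisation(observed_kb_per_s: float, link_mbit: float) -> float:
    """Fraction of the nominal link capacity actually in use."""
    return observed_kb_per_s / link_capacity_kb_per_s(link_mbit)


def per_host_ceiling(threads_per_host: int, server_delay_s: float, avg_response_s: float) -> float:
    """Upper bound in pages/s for one host under the simplified model:
    every thread spends avg_response_s fetching, then server_delay_s waiting."""
    return threads_per_host / (server_delay_s + avg_response_s)


if __name__ == "__main__":
    # Figures from the thread: 3 Mbit/s ADSL (assumed), about 20 kB/s observed.
    print(f"link utilisation: {utilisation(20, 3):.1%}")

    # 30 threads per site and a 0.5 s delay are from the thread;
    # the 4.5 s average response time is a made-up illustration value.
    print(f"per-host ceiling: {per_host_ceiling(30, 0.5, 4.5)} pages/s")
```

Under these assumptions the link is nowhere near saturated, so the delay and per-host thread settings, together with how fast the site responds, are more likely to bound the rate than the bandwidth is.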
Dennis Kubes-2 wrote:
>
> If this is stalling on only a few fetching tasks, check the logs. More
> than likely it is fetching many pages from a single site (e.g. amazon,
> wikipedia, cnn) and the politeness settings (which you want to keep) are
> slowing it down.
>
> If it is stalling on many tasks but on a single machine, check the
> hardware for that machine. We have seen hard disk speed decrease
> dramatically right before a disk is going to die. On Linux do something
> like hdparm -tT /dev/hda, where hda is the device to check. Average
> speeds for SATA should be in the 75 MB/s range for disk reads and the
> 7000+ range for cached reads.
>
> Another thing: you may be maxing out your bandwidth, and your provider
> may be throttling you.
>
> Dennis Kubes
>
> purpureleaf wrote:
>> Hi, I have worked with Nutch for some time. One thing I am always
>> curious about is that when crawling, the fetcher's speed gets slower
>> and slower, no matter what configuration I use.
>> My last test gave this (just one site, to keep the problem simple):
>>
>> OS : winxp
>> java : 1.6.0.2
>> nutch: 0.9
>> cpu : AMD 1800
>> mem : 1G
>> network : 3m adsl
>>
>> site : wikipedia.org
>> threads per site : 30
>> server.delay : 0.5
>>
>> It starts at about 6 pages/s, but drops to 4 within some minutes, then
>> gets slower and slower. I ran it for 8 hours; only 2 pages/s was left,
>> and it was still slowing down.
>> But if I stop it and start another run, it returns to full speed (then
>> slows down again). I am OK with 2 pages/s for one site, but I do hope
>> it will keep that speed.
>>
>> I found that some people on this list have the same problem, but I
>> can't find an answer.
>> Is Nutch designed to work this way?
>>
>> Thanks!
>
>

--
View this message in context: http://www.nabble.com/Fetcher-get-slower-and-slower-in-one-run-of-crawling-tf4241580.html#a12073371
Sent from the Nutch - User mailing list archive at Nabble.com.
