Nutch checks how long it has been since each URL was last fetched, and that check causes a natural delay.
But 7 minutes for 50 URLs might be too much. Did you investigate which URLs they are? Could they be large PDF files, or could your bandwidth be limited? Can you identify a bottleneck other than the check for already-seen URLs?

----- Original Message -----
From: "Weder Carlos Vieira" <[email protected]>
To: [email protected]
Sent: Wednesday, 31 July 2013 19:26:55
Subject: Re: Revaluation

I am running the commands below inside a Linux script.

bin/nutch generate -topN 50
bin/nutch fetch -all
bin/nutch parse -all
bin/nutch updatedb

This takes 7 minutes to run...

Tks

On Wed, Jul 31, 2013 at 1:19 PM, Weder Carlos Vieira <[email protected]> wrote:
> Hello
>
> Testing nutch today I could see that nutch is a little slow. This is
> because it is reviewing the urls already reviewed? checking for updates?
>
> Anyone knows if I can change it? Change nutch to find out just news urls
> to parse?
>
>
> Thanks
> Weder
>
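One way to narrow down the bottleneck asked about above is to time each step of the cycle separately instead of the script as a whole. The sketch below is only an illustration: `time_step` is a hypothetical helper name (not part of Nutch), and it assumes the script is started from the Nutch runtime directory so that `bin/nutch` resolves. The four commands are the ones quoted in the message.

```shell
#!/usr/bin/env bash
# time_step is a hypothetical helper, not a Nutch command.
# It runs the given command, then reports the elapsed wall-clock
# time in whole seconds on stderr and returns the command's status.
time_step() {
  local start end status=0
  start=$(date +%s)
  "$@" || status=$?
  end=$(date +%s)
  echo "step '$*' took $((end - start))s (exit status $status)" >&2
  return "$status"
}

# Assumption: we are in the Nutch runtime directory. If bin/nutch is
# not found, nothing is run.
if [ -x bin/nutch ]; then
  time_step bin/nutch generate -topN 50
  time_step bin/nutch fetch -all
  time_step bin/nutch parse -all
  time_step bin/nutch updatedb
fi
```

If nearly all of the time is reported for the fetch step, the likely causes are the network or large documents; if it is spread across generate and updatedb, the per-job overhead is the more likely cause.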

