It looks like you don't have enough RAM to maintain the quick speeds you were 
seeing when the index was only around 3000 pages.
 
Nutch scales very well, but the hardware behind it must scale too. A quick 
calculation shows the problem: if your total system RAM is only 512MB and all 
of it is given to Tomcat alone, you're looking at a situation where other system 
applications and/or parts of Tomcat are being executed out of swap memory. This 
will kill search speed.
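
As a rough sketch of that calculation: only the two 512MB figures come from this thread. The 64MB of JVM overhead and the 128MB for the OS and other applications are assumptions of mine, not measurements from your machine, and the helper name swapSpillMb is made up for illustration. It models the worst case, where the heap grows all the way to its -Xmx limit.

```java
class MemoryBudget {
    /**
     * Estimates how many MB of memory demand exceed physical RAM
     * and would therefore have to live in swap (0 if everything fits).
     */
    static long swapSpillMb(long physicalMb, long heapMb,
                            long jvmOverheadMb, long osAndOtherMb) {
        long demandMb = heapMb + jvmOverheadMb + osAndOtherMb;
        return Math.max(0, demandMb - physicalMb);
    }

    public static void main(String[] args) {
        long physicalMb = 512;    // from the thread: total system RAM
        long heapMb = 512;        // from the thread: -Xmx512m given to Tomcat
        long jvmOverheadMb = 64;  // assumption: JVM memory outside the heap
        long osAndOtherMb = 128;  // assumption: kernel plus background apps

        System.out.println("Estimated swap spill with 512MB RAM:  "
                + swapSpillMb(physicalMb, heapMb, jvmOverheadMb, osAndOtherMb) + " MB");
        System.out.println("Estimated swap spill with 1024MB RAM: "
                + swapSpillMb(1024, heapMb, jvmOverheadMb, osAndOtherMb) + " MB");
    }
}
```

Whatever the real overhead numbers are on your box, any value above zero means something is being paged out while searches run.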
 
My recommendation would be to get more RAM. Another 512MB should support a 1.5 
million page index running at the speeds you experienced during your 3000 page 
trials. If you can get even more, then you're only helping system (search) 
performance.

Here are a few other tips, just in case you can't get any more RAM at this time:
 
1. Make sure you're passing "-server" via JAVA_OPTS.
2. Disable all non-required system and user applications.
3. Download or install the newest stable kernel and recompile it without the 
drivers and features you don't need.
4. Reduce the size of your index.
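
For tip 1, one way to confirm that the options are actually picked up is a tiny check like the one below. This is only a sketch: the class name JvmCheck is made up, and it reports on the JVM it runs in, not on Tomcat itself, so it assumes you launch it with the same options you put in JAVA_OPTS (for example: java -server -Xmx512m JvmCheck). On Sun's HotSpot JVM the VM name normally contains "Server VM" or "Client VM"; other JVMs may report something different.

```java
class JvmCheck {
    /** Maximum heap the JVM will try to use, in MB (roughly the -Xmx value). */
    static long maxHeapMb() {
        return Runtime.getRuntime().maxMemory() / (1024 * 1024);
    }

    /** Name of the running VM, taken from the standard java.vm.name property. */
    static String vmName() {
        return System.getProperty("java.vm.name", "unknown");
    }

    public static void main(String[] args) {
        System.out.println("VM name:  " + vmName());
        System.out.println("Max heap: " + maxHeapMb() + " MB");
    }
}
```

The reported maximum can come out a little below the -Xmx figure, because some JVMs leave part of the heap out of that number.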

 
----- Original Message ----
From: shrinivas patwardhan <[EMAIL PROTECTED]>
To: [email protected]
Sent: Friday, December 29, 2006 4:45:41 AM
Subject: Re: search performance


Thank you, Sean Dean, for your quick reply.
Well, I am running Nutch on Ubuntu 5.01 and JDK 1.5.
There are some apps running in the background, but they don't take up that
much memory.
Secondly, I can understand the first search being slow, but the searches
following it also take time; even getting the next 10 pages takes some
time.
So, looking at all the issues, does it relate to my system as a whole, or
did I go wrong somewhere in the indexing process?
I just followed the tutorial for Nutch 0.7.2, under the section on whole-web
crawling.
When I indexed just about 3000 pages (a subset of the DMOZ index), the search
results were quick, but now, after loading the index for almost
1.5 million pages, search really bogs down.
I used to get a Java heap space error in Tomcat, so I fixed it by setting
JAVA_OPTS to -Xmx512m.
I guess I have made myself clear now. So what do you guys think must be
wrong?

Thanks
Shrinivas
_______________________________________________
Nutch-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/nutch-general
