Normally only the first query takes that long.
Do you plan to reboot the search server often?
If you do that from a script, you can add something like wget ...? query=http to it.
A cache makes some sense, but only if you have multiple search servers and many repeated identical queries.
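
If wget is not handy, the same warm-up step could be sketched in Python roughly as below. This is only an illustration under assumptions: the URL in SEARCH_URL and the "query" parameter name are guesses at how the search front end is exposed, and warm_up, build_url and http_fetch are hypothetical helper names, not part of Nutch.

```python
import time
import urllib.parse
import urllib.request

# Assumption: the search front end answers plain GET requests at this URL
# and reads the search terms from a "query" parameter. Adjust to your setup.
SEARCH_URL = "http://localhost:8080/search.jsp"


def build_url(base_url, term):
    """Return the search URL for one term, with the term URL-encoded."""
    return base_url + "?" + urllib.parse.urlencode({"query": term})


def http_fetch(url):
    """Fetch the page and discard the body; only the side effect matters."""
    with urllib.request.urlopen(url, timeout=60) as response:
        response.read()


def warm_up(terms, base_url=SEARCH_URL, fetch=http_fetch):
    """Send one search per term and return (term, seconds) pairs."""
    timings = []
    for term in terms:
        start = time.perf_counter()
        fetch(build_url(base_url, term))
        timings.append((term, time.perf_counter() - start))
    return timings
```

Calling warm_up(["http", "insurance", "linux"]) from the restart script would send three searches and report how long each one took, so the slow first query is paid by the script rather than by a user.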

On 06.02.2006 at 21:33, Insurance Squared Inc. wrote:

Hi,

We are running Nutch 0.71 on Mandrake Linux 2006 (P4 with two SATA drives in RAID 0, 2 GB of RAM, about 4 million pages, but expecting to hit 10+ million), and finding that our initial queries take up to 15-20 seconds to return results. I'd like to get that sped up and am seeking thoughts on how to do so.

My initial thought is that I need to do something with caching. Byron M had commented a while back on this list:
----
I would warm up your index by throwing queries at it to get the blocks cached on an OS level or work on implementing RAMDirectory instead of FSDirectory to store your index in ram if you have the resources to do so.
----
Does this seem like the best way to ensure my users are getting fast search results? If so, does someone have a list of queries that I might try using? I suppose I could use the list of 'one day's worth of search terms' that I found here:
http://blog.sli-systems.com/2006/02/what_would_the_search_engines.html
However, I'm not sure how to even measure the effectiveness of that. If I ran the, say, 500K search terms in there past the software, perhaps there isn't enough cache space for that many searches.
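
One way the effect might be measured is to time a small random sample of queries on a freshly restarted server, replay a batch of warm-up queries, and then time a second, different sample. The Python sketch below assumes a `search` callable that runs one query against the server. That callable, along with the names measure_warmup and time_queries and the default sample sizes, is hypothetical and only for illustration.

```python
import random
import statistics
import time


def time_queries(terms, search):
    """Run each term through search() and return the latencies in seconds."""
    latencies = []
    for term in terms:
        start = time.perf_counter()
        search(term)
        latencies.append(time.perf_counter() - start)
    return latencies


def measure_warmup(all_terms, search, warm_size=1000, probe_size=100, seed=0):
    """Return (median latency before warm-up, median latency after).

    Two disjoint probe samples are used so that the "after" figure is not
    flattered by queries that were already run in the "before" pass.
    """
    terms = list(all_terms)
    if len(terms) < 2 * probe_size + 1:
        raise ValueError("not enough terms for two probe samples and a warm-up batch")
    random.Random(seed).shuffle(terms)
    probe_before = terms[:probe_size]
    probe_after = terms[probe_size:2 * probe_size]
    warm_batch = terms[2 * probe_size:2 * probe_size + warm_size]

    before = statistics.median(time_queries(probe_before, search))
    time_queries(warm_batch, search)  # the warm-up pass itself
    after = statistics.median(time_queries(probe_after, search))
    return before, after
```

Repeating the run with a larger warm_size each time would show where the "after" median stops improving, which gives a rough idea of how many warm-up queries are worth replaying.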

Any suggestions on the above, or comments on the best way to deal with this long lead time on search results?

Thanks.





---------------------------------------------------------------
company:        http://www.media-style.com
forum:        http://www.text-mining.org
blog:            http://www.find23.net

