Just out of curiosity, does anyone here know how well query caching works in general with an extremely high-volume search engine?
It seems like as search volume goes up, and the number of unique queries goes up with it, the cache hit rate would drop, so caching would help less and less.

Urs Hoelzle (Google) mentioned this in a talk he gave at UW in 2002:
http://rakaposhi.eas.asu.edu/f02-cse494-mailarchive/msg00138.html
(link to video on this page)

-chris

On 2/7/06, Byron Miller <[EMAIL PROTECTED]> wrote:
> I use OSCache with great success.
>
> I would an amazing amount (more than I assumed) of
> queries we get are duplicate of one fashion or another
> so on top of warming things up as much as possible to
> the OS buffer cache we use OSCache as well.
>
> You could also use Squid to cache pages for x amount
> of time to offload your hotspots to free up cpu time
> for those ad-hoc/random queries. (as long as you
> aren't forcing content expire in your headers)
>
> -byron
>
> --- "Insurance Squared Inc."
> <[EMAIL PROTECTED]> wrote:
>
> > Hi,
> >
> > Running nutch 0.71 on Mandrake linux 2006 (P4 with a
> > 2 sata drives on
> > raid 0, 2 gigs of ram, about 4 million pages, but
> > expecting to hit 10+),
> > and finding that our initial queries take up to
> > 15-20 seconds to return
> > results. I'd like to get that speeded up and am
> > seeking thoughts on how
> > to do so.
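
Coming back to the hit-rate question at the top of this message: one rough way to explore it is to replay a synthetic query stream through a small LRU cache and watch how the hit rate moves with cache size and query skew. The sketch below is only an illustration. The Zipf-like popularity model, the function name lru_hit_rate, and all of its parameters are my own assumptions, not anything taken from Nutch, OSCache, or the talk linked above, and real query logs may behave quite differently.

```python
import random
from collections import OrderedDict
from itertools import accumulate


def lru_hit_rate(num_queries, vocabulary, cache_size, skew=1.0, seed=0):
    """Replay a synthetic query stream through an LRU cache; return the hit rate.

    Assumption (hypothetical model): query popularity is Zipf-like, i.e. the
    k-th most popular query is drawn with weight 1 / k**skew. skew=0 gives a
    uniform stream in which every query is equally likely.
    """
    rng = random.Random(seed)
    # Cumulative weights let random.choices sample without re-summing each time.
    cum_weights = list(
        accumulate(1.0 / (k ** skew) for k in range(1, vocabulary + 1))
    )
    stream = rng.choices(range(vocabulary), cum_weights=cum_weights, k=num_queries)

    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for query in stream:
        if query in cache:
            hits += 1
            cache.move_to_end(query)  # mark as most recently used
        else:
            cache[query] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used entry
    return hits / num_queries


if __name__ == "__main__":
    for size in (50, 500, 5000):
        rate = lru_hit_rate(num_queries=20000, vocabulary=5000, cache_size=size)
        print(f"cache_size={size}: hit rate {rate:.3f}")
```

Under this model, whether caching keeps paying off as volume grows depends mostly on how skewed the query distribution is; varying skew and vocabulary against a fixed cache_size shows the effect.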
