Dan,

You may want to ask on the Solr, Lucene, or Nutch lists. However, I can already tell you that these numbers look a little... overly optimistic :)
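A rough back-of-envelope check of the proposed cluster may help frame the question. The sketch below uses only the figures from the quoted message (500 million pages per node, 20 nodes, 4 TB of disk, 32 GB of RAM, 25 searches per second). Two things are assumptions, not facts from the thread: that the 4 TB is usable space after RAID 5, and that the index is partitioned by document with no replicas, so every query is sent to every node. The function name capacity_summary is made up for illustration.

```python
# Back-of-envelope check of the proposed cluster.
# Figures come from the quoted message; the sharding model and
# "4 TB usable after RAID 5" are assumptions made for this sketch.

PAGES_PER_NODE = 500_000_000
NODES = 20
DISK_BYTES_PER_NODE = 4 * 10**12  # assumption: 4 TB is usable, not raw
RAM_BYTES_PER_NODE = 32 * 2**30   # 32 GB of RAM per node
TARGET_QPS = 25                   # cluster-wide searches per second


def capacity_summary():
    """Return per-page budgets and per-node query load (hypothetical helper)."""
    total_pages = PAGES_PER_NODE * NODES
    disk_bytes_per_page = DISK_BYTES_PER_NODE / PAGES_PER_NODE
    ram_bytes_per_page = RAM_BYTES_PER_NODE / PAGES_PER_NODE
    # Assumption: document-partitioned index, no replicas. Each query must
    # consult every partition, so each node sees the full query rate
    # rather than 1/20th of it.
    per_node_qps = TARGET_QPS
    return {
        "total_pages": total_pages,
        "disk_bytes_per_page": disk_bytes_per_page,
        "ram_bytes_per_page": ram_bytes_per_page,
        "per_node_qps": per_node_qps,
    }


if __name__ == "__main__":
    for name, value in capacity_summary().items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions each page gets 8,000 bytes of disk for index plus any stored content, and roughly 69 bytes of RAM, while every node still has to answer all 25 searches per second against its 500 million pages. If the index were replicated instead of only partitioned, the per-node query rate would drop, but the per-page budgets would not improve.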
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Dan Segel <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Thursday, June 5, 2008 9:12:31 AM
> Subject: Gigablast.com search engine, 10billion pages!!!
>
> Our ultimate goal is to basically replicate the gigablast.com search engine.
> They claim to have fewer than 500 servers that contain 10 billion pages
> indexed, spidered and updated on a routine basis... I am looking at
> featuring 500 million pages indexed per node, with a total of 20 nodes.
> Each node will feature 2 quad-core processors, 4 TB (at RAID 5) and 32 GB of
> RAM. I believe this can be done; however, how many searches per second do you
> think would be realistic in this instance? We are looking at achieving
> 25+/- searches per second, ultimately spread out over the 20 nodes... I can
> really use some advice with this one.
>
> Thanks,
> D. Segel
