Dan, you left out one important "bit" - this is a 64-bit machine?

Sean, out of curiosity... is running 8 separate Nutch instances really better than running a single JVM instance and a single Nutch instance on a multi-core 64-bit machine with 32GB of RAM, and letting the OS switch between cores?

As for fetching/indexing/searching - you probably don't want to do all of this on the same set of machines. Use one set of machines for fetching/indexing, and another set of machines for serving search requests.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

----- Original Message ----
> From: Sean Dean <[EMAIL PROTECTED]>
> To: [email protected]
> Sent: Thursday, June 5, 2008 3:45:41 PM
> Subject: Re: Hardware Specifications
>
> Another idea is to set up 8 separate Nutch instances on the same server, each
> with its own 20M index.
>
> The idea behind this is that one core per application will be used, although
> it's not pegged, and the RAM is used in ~4GB chunks (JVM setting) for each
> instance.
>
> This would be used for serving results only though; you would have to disable
> part or all of this when in fetching mode, but it would give you 160M pages
> and still very good speeds (about 4-5 per second or more as other factors
> come into play). Keep in mind we use 8 hard drives, each associated with its
> own instance on the server, but as long as the RAID FC setup you have is very
> fast the results should be comparable (maybe even faster).
>
>
> ----- Original Message ----
> From: Dennis Kubes
> To: [email protected]
> Sent: Thursday, June 5, 2008 2:38:04 PM
> Subject: Re: Hardware Specifications
>
> In memory index 15M. On disk index, slower but still doable where
> response time isn't critical, ~350M pages maybe more.
>
> Dennis
>
> Dan Segel wrote:
> > We have a server that has 30TB of hard drive space connected through fiber,
> > 2 quad core 2.5ghz, and 32gb of ram. If fetching 5 searches per second how
> > many million indexed pages do you think we can achieve?
> >
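
For reference, the arithmetic behind the 8-instance layout quoted above can be sketched roughly as follows. This is only a back-of-the-envelope check using the figures from the thread (8 instances, 20M pages each, ~4GB of JVM heap each, 32GB of RAM); the function and variable names are made up for illustration and are not part of Nutch.

```python
# Back-of-the-envelope sizing for the multi-instance layout in the thread.
# All names here are hypothetical; the numbers come from the quoted messages.

INSTANCES = 8                # one Nutch instance per core (2 x quad core)
PAGES_PER_INSTANCE_M = 20    # index size per instance, in millions of pages
HEAP_PER_INSTANCE_GB = 4     # approximate JVM heap per instance
SERVER_RAM_GB = 32           # total RAM on the server


def total_pages_m(instances, pages_per_instance_m):
    """Total indexed pages across all instances, in millions."""
    return instances * pages_per_instance_m


def total_heap_gb(instances, heap_per_instance_gb):
    """Combined JVM heap across all instances, in GB."""
    return instances * heap_per_instance_gb


def headroom_gb(server_ram_gb, heap_gb):
    """RAM left for the OS and file cache after all heaps are allocated."""
    return server_ram_gb - heap_gb


if __name__ == "__main__":
    pages = total_pages_m(INSTANCES, PAGES_PER_INSTANCE_M)
    heap = total_heap_gb(INSTANCES, HEAP_PER_INSTANCE_GB)
    print(f"total pages: {pages}M")
    print(f"total heap: {heap}GB")
    print(f"RAM headroom: {headroom_gb(SERVER_RAM_GB, heap)}GB")
```

Taken at face value, eight ~4GB heaps add up to the full 32GB, so in practice the per-instance heap would probably need to be somewhat smaller to leave room for the OS and disk cache.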
