OK, so you aren't going to get an index of that size into memory unless you spend a lot on servers. We haven't found memory (or disk access) to be a limiting factor anyway -- CPU is the issue. I'm not sure what you want to spend, but a single server with SATA RAID, 4 GB of RAM and the latest AMD processor will search your collection in ~10-20 seconds, depending on the complexity of the search. If you need faster performance or the ability to handle many searches at once, you are going to have to split the index across multiple servers and search them in parallel using ParallelMultiSearcher.
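As a rough illustration of the parallel setup described above, here is a minimal sketch in plain Java of fanning one query out to several index shards and merging the results. It does not use the Lucene API: `ShardFanOut`, `ShardSearcher` and `Hit` are hypothetical names made up for this example, and it assumes scores from different shards can be compared directly, which is not automatically true for independently built indexes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical sketch: send one query to every shard in parallel, then merge the hits. */
final class ShardFanOut {

    /** One scored hit; the fields are illustrative and are not Lucene types. */
    static final class Hit {
        final String docId;
        final float score;

        Hit(String docId, float score) {
            this.docId = docId;
            this.score = score;
        }
    }

    /** Stand-in for a searcher over the part of the index held by one server. */
    interface ShardSearcher {
        List<Hit> search(String query, int topN) throws Exception;
    }

    /** Runs the query on all shards concurrently and returns the global top N by score. */
    static List<Hit> search(List<ShardSearcher> shards, String query, int topN) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, shards.size()));
        try {
            List<Callable<List<Hit>>> tasks = new ArrayList<>();
            for (ShardSearcher shard : shards) {
                tasks.add(() -> shard.search(query, topN));
            }
            List<Hit> merged = new ArrayList<>();
            // invokeAll waits for every shard, so the slowest shard sets the response time.
            for (Future<List<Hit>> result : pool.invokeAll(tasks)) {
                // get() rethrows a shard's failure, so one failing shard fails the whole search here.
                merged.addAll(result.get());
            }
            // Assumption: scores are comparable across shards.
            merged.sort(Comparator.comparingDouble((Hit h) -> h.score).reversed());
            return new ArrayList<>(merged.subList(0, Math.min(topN, merged.size())));
        } finally {
            pool.shutdown();
        }
    }
}
```

Each shard only needs to return its own top N, since no hit outside a shard's top N can appear in the global top N. A real deployment would reuse one thread pool across requests rather than creating one per search.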
Keep in mind that Lucene isn't really set up to handle parallel searching robustly. There is a lot of code you are going to have to write for an enterprise-ready solution (e.g., checking the status of a given server to make sure it isn't down, redundantly storing indexes so that the search still functions if one server is down, potentially handling laggards to increase speed, etc.). We have done some of this, and have more to do -- it is a very non-trivial task.

Sincerely,
James Ryley, Ph.D.

> -----Original Message-----
> From: caribou_surf [mailto:[EMAIL PROTECTED]
> Sent: Monday, August 28, 2006 10:42 AM
> To: [email protected]
> Subject: RE: Kind of hardware config ?
>
> About 100 GB
>
> James-10 wrote:
> >
> > What's the total document size?
> >
> > Sincerely,
> > James Ryley, Ph.D.
> >
> >> -----Original Message-----
> >> From: caribou_surf [mailto:[EMAIL PROTECTED]
> >> Sent: Monday, August 28, 2006 5:01 AM
> >> To: [email protected]
> >> Subject: Kind of hardware config ?
> >>
> >> We want to index about 2 million HTML documents with Lucene.
> >> Do you have an idea of the most suitable machine configuration
> >> (dual processor, 2 GB of memory, RAID disks...)?
> >> --
> >> View this message in context: http://www.nabble.com/Kind-of-hardware-config---tf2176085.html#a6016661
> >> Sent from the Lucene - General forum at Nabble.com.
>
> --
> View this message in context: http://www.nabble.com/Kind-of-hardware-config---tf2176085.html#a6021457
> Sent from the Lucene - General forum at Nabble.com.
