Hi everybody,

We are building a search infrastructure using Lucene that needs to scale up to 500 million documents with search latency under 500 ms.
Here is my rough math on the size of the content and the index:

  Total documents         = 500 million
  Size per document       = 10 KB
  Index size per million  = 2 GB per million documents
  Total index size        = 500 million x 2 GB/million ~ 1 TB

We are planning to partition this 1 TB index into 25 partitions, each holding around 20 million documents at roughly 40 GB. Since 1 TB doesn't seem to be that much, we are debating whether we should keep the whole 1 TB in RAM. We checked the prices for RAM (64 GB / 8 CPU boxes) and they are very competitive.

Now the question is: can we use RAMDirectory for all of this 1 TB, or is FSDirectory better, with a separate spindle for each CPU? We are considering 25 boxes (8 CPU / 64 GB each), one per partition, with separate brokers to merge the results (see the rough sketch in the P.S. below).

Did anybody do something like this in the past? I'd appreciate it if you could share your experiences.

Thanks,
Murali V
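P.S. To make the question concrete, here is a rough sketch of the per-partition setup and the broker-side merge we have in mind. This is only an illustration, assuming a Lucene 2.x-style API: the index paths, the field name, and the openPartition helper are made up, and in the real deployment the 25 partition searchers would live on separate boxes behind the broker rather than as local directories.

  import java.io.File;

  import org.apache.lucene.index.Term;
  import org.apache.lucene.search.IndexSearcher;
  import org.apache.lucene.search.MultiSearcher;
  import org.apache.lucene.search.Searchable;
  import org.apache.lucene.search.TermQuery;
  import org.apache.lucene.search.TopDocs;
  import org.apache.lucene.store.Directory;
  import org.apache.lucene.store.FSDirectory;
  import org.apache.lucene.store.RAMDirectory;

  public class PartitionedSearchSketch {

      // Open one ~40 GB partition either straight off disk (FSDirectory)
      // or copied entirely into the heap (RAMDirectory wrapping the on-disk index).
      static Searchable openPartition(File indexDir, boolean loadIntoRam) throws Exception {
          Directory onDisk = FSDirectory.getDirectory(indexDir);
          Directory dir = loadIntoRam ? new RAMDirectory(onDisk) : onDisk;
          return new IndexSearcher(dir);
      }

      public static void main(String[] args) throws Exception {
          // Hypothetical local paths standing in for the 25 partition boxes.
          int numPartitions = 25;
          Searchable[] shards = new Searchable[numPartitions];
          for (int i = 0; i < numPartitions; i++) {
              shards[i] = openPartition(new File("/indexes/partition-" + i), false);
          }

          // Broker side: MultiSearcher merges and re-ranks hits from all partitions.
          MultiSearcher broker = new MultiSearcher(shards);
          TopDocs top = broker.search(new TermQuery(new Term("body", "lucene")), null, 10);
          System.out.println("total hits: " + top.totalHits);
      }
  }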