It depends on many factors: how big those docs are (compare a tweet to a news 
article to a book chapter), whether you store the data or just index it, whether 
you compress it, how and how much you analyze the data, etc.
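For example, whether a field is stored, only indexed, or both is controlled per
field in schema.xml, and that choice directly affects index size. A minimal
sketch (the field names and types here are hypothetical, just for illustration):

```xml
<!-- Hypothetical schema.xml fragment, illustrative only -->

<!-- Indexed but not stored: searchable, keeps the index smaller,
     but the original text cannot be returned with results -->
<field name="body" type="text_general" indexed="true" stored="false"/>

<!-- Stored but not indexed: returned with results, not searchable -->
<field name="raw_source" type="string" indexed="false" stored="true"/>
```

Heavily analyzed fields (tokenized, with term vectors or positions enabled)
also take more space than simple string fields, so the same document count can
produce very different index sizes depending on the schema.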

Otis
----
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Hadoop ecosystem search :: http://search-hadoop.com/



----- Original Message ----
> From: Jean-Sebastien Vachon <js.vac...@videotron.ca>
> To: solr-user@lucene.apache.org
> Sent: Wed, February 24, 2010 8:57:21 AM
> Subject: Index size
> 
> Hi All,
> 
> I'm currently looking at integrating Solr, and I'd like some hints on the 
> size of the index (number of documents) I could possibly host on a 
> Double-Quad server (16 cores) with 48GB of RAM running Linux. Basically, I 
> need to determine how many of these servers would be required to host about 
> half a billion documents. Should I set up multiple Solr instances (in 
> virtual machines or not), or should I run a single instance (with multicore 
> or not) using all available memory as the cache?
> 
> I also ran some tests with sharding on this same server and could not see 
> any improvement (at least not with 4.5 million documents). Should all the 
> shards be hosted on different servers? I shall try with more documents in 
> the coming days.
> 
> Thx 
