On Mon, 2011-07-04 at 13:51 +0200, Jame Vaalet wrote:
> What would be the maximum size of a single SOLR index file for resulting in 
> optimum search time ?

There is no clear answer. It depends on the number of (unique) terms,
number of documents, bytes on storage, storage speed, query complexity,
faceting, number of concurrent users and a lot of other factors.

> In case I have got to index all the documents in my repository (which is in 
> TB size) what would be the ideal architecture to follow, distributed SOLR?

A TB in source documents might very well end up as a simple, single
machine index of 100GB or less. It depends on the amount of search
relevant information in the documents, rather than their size in bytes.

If your sources are Word documents or a similar format with a relatively
large amount of stuffing, and your searches are mostly simple ("the user
enters 2-5 words and hits enter"), my guess is that you don't need to
worry about distribution yet.

Make a pilot. Most of the work you'll have to do for a single machine
test can be reused for a distributed production setup.
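As a rough sketch of such a pilot (the Solr URL, post tool path, and
sample query below are assumptions for illustration, not details from
this thread):

```shell
# Illustrative pilot commands -- the core URL, tool location and query
# are assumptions, adjust them to your installation.

# Index a representative sample of extracted documents with Solr's
# bundled post tool:
java -jar example/exampledocs/post.jar sample-docs/*.xml

# Time a typical short user query against the single-machine index:
time curl -s "http://localhost:8983/solr/select?q=annual+report&rows=10" \
  > /dev/null
```

If response times stay acceptable as you grow the sample, a single
machine may well carry the full corpus; if not, the same schema,
analysis chain and queries carry over to a sharded setup.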
