Hi Chris,

I'm responsible for the webshots.com search index, and we've had very
good results with Lucene. It currently indexes over 100 million
documents and handles 4 million searches per day.

We initially tested running multiple smaller indexes behind a
MultiSearcher and merging the results, as compared to running a single
very large index. We actually found that the single large index
performed better. To improve load handling, we clustered multiple
identical copies together, then bound each user session to a particular
server and cached the results there. Each server still runs a single index.
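
A minimal sketch of the session binding and result caching described
above, for illustration only. It is not the webshots.com code: the
server names, the function and class names, the MD5-based routing and
the LRU eviction policy are all assumptions, and search_fn stands in
for whatever actually queries the single index on that server.

```python
import hashlib
from collections import OrderedDict

# Hypothetical server names; each one holds an identical copy of the full index.
SERVERS = ["search1", "search2", "search3"]


def server_for_session(session_id, servers=SERVERS):
    """Map a session ID to one server so repeat requests reach the same cache."""
    digest = hashlib.md5(session_id.encode("utf-8")).digest()
    # Use the first 4 bytes of the digest as an integer and pick a server.
    return servers[int.from_bytes(digest[:4], "big") % len(servers)]


class ResultCache:
    """Small per-server LRU cache of search results (eviction policy assumed)."""

    def __init__(self, max_entries=1000):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def get_or_search(self, query, search_fn):
        # Cache hit: mark the entry as most recently used and return it.
        if query in self._entries:
            self._entries.move_to_end(query)
            return self._entries[query]
        # Cache miss: run the real search against the single local index.
        results = search_fn(query)
        self._entries[query] = results
        # Evict the least recently used entry once the cache is over capacity.
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)
        return results
```

Because a session always maps to the same server, a user paging through
the same result set keeps hitting one warm cache rather than repeating
the search on every machine in the cluster.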

Bryan McCormick

On Fri, 2005-02-18 at 08:01, Chris D wrote: 
> Hi all, 
> I have a question about scaling Lucene across a cluster, and good ways
> of breaking up the work.
> We have a very large index and searches sometimes take more time than
> they're allowed. What we have been doing is during indexing we index
> into 256 separate indexes (depending on the md5sum) then distribute
> the indexes to the search machines. So if a machine has 128 indexes it
> would have to do 128 searches. I gave ParallelMultiSearcher a try and
> it was significantly slower than simply iterating through the indexes
> one at a time.
> Our new plan is to somehow have only one index per search machine and
> a larger main index stored on the master.
> What I'm interested to know is whether having one extremely large
> index for the master then splitting the index into several smaller
> indexes (if this is possible) would be better than having several
> smaller indexes and merging them on the search machines into one
> index.
> I would also be interested to know how others have divided up search
> work across a cluster.
> Thanks,
> Chris
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
