The reason there is a ParallelMusltiSearcher is because of the reasons given: if you are distributing your index across machines or hard drives, doing things in parallel is fater. I don't think RAID counts. RAID will do the parallelism for you with a single index.

I say I believe because its hard to be certain with Lucene...there are a lot of variables depending on your setup, and frankly, a lot my information is not hard core, but from memories of posts and work and playing, etc. I don't want to sound more definitive than I am. On a normal system, you will find ParallelMultiSearcher slower. If your distributing your index over multiple machines or hard drives, you will find it faster.

This mailing list is one of the best places for best practices: http://www.nabble.com/Lucene-f44.html

Search it, and you will find discussions on this very topic.

The next best place to look is the FAQ at the Lucene home page. There is a growing list of advice and best practices. Your still not going to find a lot that is definitive. A ton depends on your particular circumstances. There has been a lot of work done lately to make sure Lucene runs the best it can using out of the box defaults, but this can only go so far. Lucene is *very* configurable, and the best configuration can depend on very specific setups. The best advice is to try and test, try and test. If your a nut for getting things perfect, there is an excellent benchmark system in contrib that will help your efforts tremendously.

- Mark



Timo Nentwig wrote:
On Tuesday 01 January 2008 22:24:53 Mark Miller wrote:
I believe that, in general, you'll find that ParallelMultiSearcher is

You believe or you know? And if you know why is there a ParallelMultiSearcher at all? :)

And I still wonder why everybody belives and finds out on his own why isn't there are comprehensive collection of common knowledge and best practices for lucene out there?

much slower than just using a MultiSearcher. ParralelMultiSeacher is of
use when you can put the different indexes on separate hard drives or

Well, the index might not be on different HDs but *of couse* we're talking about multiple hard drives (at least some RAID, in my case it's some expensive netapp, however I don't know which one exactely but I can find out...).

even better, separate systems (using RMI or something).

- Mark

Timo Nentwig wrote:
On Tuesday 01 January 2008 21:06:06 Mark Miller wrote:
The main reason to use a single IndexReader is because its very time
consuming to open an IndexReader. If your index is pretty static, maybe
Yes, it takes quite some time to build it and it's not changed but
rebuilt from scratch.

Perhaps, in some esoteric case, multiple readers is the right idea
I recently talked to a guy how stated that they'd solved their
performance issues by breaking up the index into multiple sub-indices and
searching them in parallel (probably using ParallelMultiSearcher)...hmm,
well, I've had (and still have) my doubts but on the other hand what's
the benefit of ParallelMultiSearcher if it doesn't scale better than
searching a monolithic index?

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]



---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

Reply via email to