FYI,
For a process that is mostly CPU bound (which is the case with Lucene
if the index is in the OS cache), having so many "active" threads
will actually hurt performance due to the context switching and
synchronization. Better to use a request queue / thread pool. (I
think I read somewhere that a good rule of thumb for the pool size is
about 2x the number of processors.)
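To make the request queue / thread pool idea concrete, here is a minimal sketch using java.util.concurrent. The class name BoundedSearchPool and the squaring task standing in for a search are made up for illustration, and the 2x multiplier is only the rule of thumb mentioned above, not a measured value.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class BoundedSearchPool {
    // Rule of thumb from the text: about 2x the number of processors.
    public static int poolSize() {
        return 2 * Runtime.getRuntime().availableProcessors();
    }

    // Runs every request on a fixed-size pool. Requests beyond the pool
    // size wait in the executor's internal queue instead of each getting
    // a thread of its own.
    public static <T> List<T> runAll(List<Callable<T>> requests) throws Exception {
        // A real server would keep one long-lived pool; creating it per
        // call just keeps this sketch self-contained.
        ExecutorService pool = Executors.newFixedThreadPool(poolSize());
        try {
            List<T> results = new ArrayList<>();
            for (Future<T> f : pool.invokeAll(requests)) {
                results.add(f.get()); // results come back in request order
            }
            return results;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<Callable<Integer>> requests = new ArrayList<>();
        for (int i = 0; i < 100; i++) {
            final int n = i;
            requests.add(() -> n * n); // hypothetical stand-in for a search
        }
        List<Integer> results = runAll(requests);
        System.out.println(results.size() + " requests served by "
                + poolSize() + " threads");
    }
}
```

The point is only that the number of runnable threads stays bounded no matter how many requests arrive; the right pool size still has to be found by measurement.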
If most of the searches are IO bound, having so many disparate
requests will hurt performance as well, since the disk heads will be
seeking all over the place and losing any locality of data that
Lucene provides (postings, sequential term reads, etc.).
There are some excellent academic papers I just came across on
high-performance parallel disk-based sorting, and many of the
techniques and concerns apply to Lucene.
Robert
On Feb 2, 2007, at 3:38 PM, Yonik Seeley wrote:
On 2/2/07, Doug Cutting <[EMAIL PROTECTED]> wrote:
Yonik Seeley wrote:
> I ran across a situation where a great number of threads were blocked on
> ensureIndexIsRead(), even after it had already been loaded.
That sounds bizarre. A sync block that tests a field for non-null
shouldn't tie things up much, I wouldn't think.
There were hundreds of threads all blocked on the same lock.
I think synchronization can become expensive under heavy contention,
regardless of how lightweight the code inside.
It's obviously not the root cause of the problem... the query
structure was very expensive (a range query covering most documents
that didn't get pulled out into a Filter), but it still could be an
area of improvement.
I'm going to try and see if I can duplicate it, then see what effect
removing the synchronization has.
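One way to test the effect of removing the lock from the already-loaded path is a volatile field with a lock-free fast path. This is only a sketch of that idea, not Lucene's TermInfosReader code: the class LazyIndex, its long[] payload and loadIndex() are hypothetical, and it assumes the Java 5+ memory model, where double-checked locking on a volatile field is safe.

```java
public class LazyIndex {
    // volatile: once one thread publishes the loaded array, other
    // threads see it without taking the lock.
    private volatile long[] index;
    private int loads = 0; // only touched while holding the lock

    public long[] ensureLoaded() {
        long[] local = index; // one volatile read on the fast path
        if (local == null) {
            synchronized (this) {
                local = index; // re-check under the lock
                if (local == null) {
                    local = loadIndex();
                    index = local;
                }
            }
        }
        return local;
    }

    // Hypothetical stand-in for reading the term index from disk.
    private long[] loadIndex() {
        loads++;
        return new long[] {1L, 2L, 3L};
    }

    public synchronized int loadCount() {
        return loads;
    }

    public static void main(String[] args) throws InterruptedException {
        final LazyIndex idx = new LazyIndex();
        Thread[] threads = new Thread[16];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> idx.ensureLoaded());
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("index loaded " + idx.loadCount() + " time(s)");
    }
}
```

After the first load, callers never enter the synchronized block, so hundreds of threads no longer queue on the same monitor just to test a field for non-null.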
Are you sure that one of the threads wasn't actually reading the index?
Yep. We've seen the same thing with older versions of Lucene when
multiple threads tried to sort on the same field and there was massive
contention from everyone trying to generate the same entry.
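For the sort case, the duplicated work can be avoided by making sure only one thread builds a given entry while the others wait for it. A rough sketch, assuming Java 8+ and ConcurrentHashMap.computeIfAbsent; SortCache and its int[] entries are made up for illustration and are not how Lucene caches sort data.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public class SortCache {
    private final ConcurrentHashMap<String, int[]> entries = new ConcurrentHashMap<>();
    private final AtomicInteger builds = new AtomicInteger();

    public int[] get(String field) {
        // computeIfAbsent applies the builder at most once per key;
        // threads asking for the same field wait for that one result
        // instead of each generating their own copy.
        return entries.computeIfAbsent(field, this::build);
    }

    // Hypothetical stand-in for the expensive per-field sort data.
    private int[] build(String field) {
        builds.incrementAndGet();
        return new int[] {field.length()};
    }

    public int buildCount() {
        return builds.get();
    }

    public static void main(String[] args) throws InterruptedException {
        final SortCache cache = new SortCache();
        Thread[] threads = new Thread[16];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> cache.get("price"));
            threads[i].start();
        }
        for (Thread t : threads) {
            t.join();
        }
        System.out.println("entry built " + cache.buildCount() + " time(s)");
    }
}
```

Threads sorting on the same field still wait for the first build, but the work is done once rather than once per thread, and lookups on other fields are not serialized behind one global lock.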
Or perhaps some other method also synchronizes on the same object?
Good question... I only checked TermInfosReader itself.
-Yonik
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]