Re: TermInfosReader lazy term index reading

robert engels Fri, 02 Feb 2007 14:56:42 -0800

I think that is much more involved... I don't think there is an easy way to move a query between threads/pools once it has started unless you restart the entire query.

You might be able to dynamically lower the thread priority, however, when you detect a long query, so that smaller (faster) queries would have priority.
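
A minimal sketch of the priority idea, assuming the search loop can call a small hook periodically (for example from a hit collector callback). The class name, method names, and the time threshold are hypothetical; only Thread.setPriority and the Thread priority constants are standard Java. Thread priorities are only a hint to the scheduler, so the effect may vary by platform.

```java
// Hypothetical sketch: a worker thread lowers its own priority once the
// query it is running has exceeded a "long query" threshold.
class PriorityDemotion {
    // Priority a thread should have after running for elapsedMillis.
    static int priorityFor(long elapsedMillis, long longQueryMillis) {
        return elapsedMillis >= longQueryMillis
                ? Thread.MIN_PRIORITY
                : Thread.NORM_PRIORITY;
    }

    // Intended to be called periodically from inside the search loop.
    // startMillis is the time the query started (System.currentTimeMillis()).
    static void maybeDemote(long startMillis, long longQueryMillis) {
        long elapsed = System.currentTimeMillis() - startMillis;
        int wanted = priorityFor(elapsed, longQueryMillis);
        Thread current = Thread.currentThread();
        if (current.getPriority() != wanted) {
            current.setPriority(wanted);
        }
    }
}
```

If the worker comes from a pool, its priority would need to be reset to Thread.NORM_PRIORITY when the query finishes, otherwise the next (possibly short) query inherits the lowered priority.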



On Feb 2, 2007, at 4:44 PM, Doron Cohen wrote:

robert engels <[EMAIL PROTECTED]> wrote on 02/02/2007 14:08:46:

You might be able to quantify the search request ahead of time (# of
terms, # of high frequency terms, etc.) and assign the request to the
appropriate pool (quick, normal, lengthy).

Then you can assign an appropriate # of threads to each pool.
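
As a rough sketch of the pool assignment described above, assuming the request can be summarized by two numbers before it runs. The class name, thresholds, and pool sizes are illustrative assumptions, not measured values; only the java.util.concurrent calls are standard.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: route a search request to a pool by estimated cost.
class QueryPools {
    enum Cost { QUICK, NORMAL, LENGTHY }

    // Pool sizes are placeholders; tune them to the workload.
    private final ExecutorService quick = Executors.newFixedThreadPool(4);
    private final ExecutorService normal = Executors.newFixedThreadPool(2);
    private final ExecutorService lengthy = Executors.newFixedThreadPool(1);

    // termCount: number of terms in the query.
    // highFreqTerms: how many of those terms are considered high frequency.
    static Cost classify(int termCount, int highFreqTerms) {
        if (highFreqTerms >= 3 || termCount > 20) return Cost.LENGTHY;
        if (highFreqTerms >= 1 || termCount > 5) return Cost.NORMAL;
        return Cost.QUICK;
    }

    ExecutorService poolFor(Cost cost) {
        switch (cost) {
            case QUICK: return quick;
            case NORMAL: return normal;
            default: return lengthy;
        }
    }

    // Queue the search on the pool that matches its estimated cost.
    void submit(int termCount, int highFreqTerms, Runnable search) {
        poolFor(classify(termCount, highFreqTerms)).execute(search);
    }

    void shutdown() {
        quick.shutdown();
        normal.shutdown();
        lengthy.shutdown();
    }
}
```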


Or, to avoid pre-computation, requests can first be assigned to a
'faster' queue, assuming they are short, and only later, if a
request turns out to be longer, it can be dynamically moved to a
'slower' queue, maybe less prioritized. (Similar I think to OS
job scheduling.) (Can have more than 2 queues.)

I wonder if there's danger that queueing queries would increase the
avg time-to-complete, even if the total time is reduced?


Most people understand that complex queries might take longer to
execute.


On Feb 2, 2007, at 4:01 PM, Yonik Seeley wrote:

On 2/2/07, robert engels <[EMAIL PROTECTED]> wrote:

For a process that is mostly CPU bound (which is the case with Lucene
if the index is in the OS cache), having so many "active" threads
will actually hurt performance due to the context switching and
synchronization.


Sure... it certainly wasn't by design to have that many threads all
trying to do something.

Better to use a request queue / thread pool. (I
think I read somewhere that a good rule of thumb is 2x the number of
processors).
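
A minimal sketch of such a request queue / thread pool, using the 2x rule of thumb mentioned above as an assumption rather than a tuned value. The class and method names are hypothetical. Executors.newFixedThreadPool keeps excess requests in an unbounded queue until a worker is free.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical sketch: one shared pool for all search requests.
class SearchPool {
    // Rule-of-thumb sizing: twice the number of processors, at least one.
    static int poolSize(int processors) {
        return Math.max(1, 2 * processors);
    }

    // Requests beyond the pool size wait in the executor's internal queue.
    static ExecutorService create() {
        int processors = Runtime.getRuntime().availableProcessors();
        return Executors.newFixedThreadPool(poolSize(processors));
    }
}
```

Each incoming search would then be wrapped in a Runnable and handed to the pool with execute(), instead of being run on its own request thread.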


You might hit a scenario where a couple of threads are doing long
running queries, and that could lock out other queries that might
otherwise execute quickly.  But overall, it's not a bad idea.

If most of the searches are I/O bound, having so many disparate
requests will hurt performance as well, since the disk heads will be
seeking all over the place and losing any locality of data that
Lucene provides (postings, sequential term reads, etc.).


We're not hitting disk... plenty of RAM.

-Yonik

---------------------------------------------------------------------

To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]


