Re: Query performance in Lucene 4.x

Desidero Tue, 01 Oct 2013 14:38:54 -0700

For anyone who was wondering, this was actually resolved in a different
thread today. I misread the documentation for the
IndexSearcher(IndexReader,ExecutorService) constructor. I was under the
impression that it submitted one task per index shard (the MultiReader
wraps 20 shards, so 20 tasks), but it actually submits one task per
segment within each shard (20 shards * ~10 segments = ~200 tasks per
query), which is a lot of scheduling overhead. Since my index changes
infrequently, I now call forceMerge(1) before sending updated indexes out
to the slave servers. Without any extra tuning (threads, number of shards,
etc.), I've gone from ~2900 requests per minute to ~10k requests per minute.
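
To make the task arithmetic concrete, here is a minimal sketch in plain
Java with no Lucene dependency. It only models the behavior described
above (one submitted task per segment per query). The class TaskFanOut and
the method tasksPerQuery are hypothetical names for illustration, and the
no-op task stands in for the real per-segment search.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

/** Hypothetical model of per-segment task submission; this is not Lucene code. */
public class TaskFanOut {

    /** Submits one no-op task per segment and returns how many tasks one query needed. */
    public static int tasksPerQuery(int[] segmentsPerShard, ExecutorService pool) {
        List<Future<?>> pending = new ArrayList<>();
        for (int segments : segmentsPerShard) {
            for (int s = 0; s < segments; s++) {
                pending.add(pool.submit(() -> { /* stand-in for searching one segment */ }));
            }
        }
        // The query cannot return until every segment task has finished.
        for (Future<?> f : pending) {
            try {
                f.get();
            } catch (InterruptedException | ExecutionException e) {
                throw new RuntimeException(e);
            }
        }
        return pending.size();
    }

    public static void main(String[] args) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            int[] unmerged = new int[20];
            Arrays.fill(unmerged, 10); // 20 shards, ~10 segments each
            int[] merged = new int[20];
            Arrays.fill(merged, 1);    // the same shards after forceMerge(1)

            System.out.println("unmerged: " + tasksPerQuery(unmerged, pool) + " tasks per query");
            System.out.println("merged:   " + tasksPerQuery(merged, pool) + " tasks per query");
        } finally {
            pool.shutdown();
            pool.awaitTermination(5, TimeUnit.SECONDS);
        }
    }
}
```

The model ignores the cost of the search itself. It only shows how the
number of submitted tasks scales with the segment count rather than the
shard count.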


Thanks to Adrien and Mike for the clarification and Benson for bringing up
the question that led to my answer.

I'm still pretty new to Lucene so I have a lot of poking around to do, but
I'm going to try to implement the "virtual segment" concept that Mike
mentioned. It'll be really helpful for those of us who want parallelism
within queries and don't want to forceMerge.
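
This thread does not spell out the "virtual segment" design, so the
following is only one possible reading, not Mike's proposal: pack the real
segments into a bounded number of groups and run one task per group instead
of one per segment. The sketch is plain Java with no Lucene dependency. The
class SegmentGrouper, the method group, and the document counts in main are
all hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical sketch: pack segments into a bounded number of groups. Not Lucene code. */
public class SegmentGrouper {

    /**
     * Greedy packing: visit segments from largest to smallest and assign each
     * to the group that currently holds the fewest documents.
     * Returns, for each group, the indexes of the segments assigned to it.
     */
    public static List<List<Integer>> group(int[] docCounts, int maxGroups) {
        if (maxGroups < 1) {
            throw new IllegalArgumentException("maxGroups must be >= 1");
        }
        int groups = Math.min(maxGroups, docCounts.length);
        List<List<Integer>> result = new ArrayList<>();
        long[] load = new long[groups];
        for (int g = 0; g < groups; g++) {
            result.add(new ArrayList<>());
        }

        // Segment indexes ordered by descending document count.
        Integer[] order = new Integer[docCounts.length];
        for (int i = 0; i < order.length; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Integer.compare(docCounts[b], docCounts[a]));

        for (int seg : order) {
            int lightest = 0;
            for (int g = 1; g < groups; g++) {
                if (load[g] < load[lightest]) {
                    lightest = g;
                }
            }
            result.get(lightest).add(seg);
            load[lightest] += docCounts[seg];
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up document counts for the ~10 segments of one shard.
        int[] docCounts = {900_000, 50_000, 40_000, 30_000, 20_000, 10_000, 5_000, 3_000, 1_000, 500};
        List<List<Integer>> groups = group(docCounts, 3);
        for (int g = 0; g < groups.size(); g++) {
            System.out.println("group " + g + ": segments " + groups.get(g));
        }
    }
}
```

With a cap of 3 groups per shard, 20 shards would produce at most 60 tasks
per query instead of ~200, while still leaving some parallelism within each
query.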


On Fri, Sep 27, 2013 at 9:55 AM, Desidero <desid...@gmail.com> wrote:

> Erick,
>
> Thank you for responding.
>
> I ran tests using both compressed fields and uncompressed fields, and it
> was significantly slower with uncompressed fields. I looked into the lazy
> field loading per your suggestion, but we don't get any values from the
> returned Documents until the result set has been appropriately reduced.
> Since we only store one retrievable field and we always need to get it, it
> doesn't save any time loading it lazily.
>
> I'll try running a test without loading any fields just to see how it
> affects performance and let you know how that goes.
>
> Regards,
> Matt
>
>
> On Fri, Sep 27, 2013 at 8:01 AM, Erick Erickson
> <erickerick...@gmail.com> wrote:
>
>> Hmmm, since 4.1, fields have been stored compressed by default.
>> I suppose it's possible that this is a result of
>> compressing/uncompressing.
>>
>> What happens if
>> 1> you enable lazy field loading
>> 2> don't load any fields?
>>
>> FWIW,
>> Erick
>>
>> On Thu, Sep 26, 2013 at 10:55 AM, Desidero <desid...@gmail.com> wrote:
>> > A quick update:
>> >
>> > In order to confirm that none of the standard migration changes had a
>> > negative effect on performance, I ported my Lucene 4.x version back to
>> > Lucene 3.6.2 and kept the newer API rather than using the custom
>> > ParallelMultiSearcher and other deprecated methods/classes.
>> >
>> > Performance in 3.6.2 is even faster than before (~2900 requests/min
>> > with 4.x vs ~6200 requests/min with 3.6.2), so none of my code changes
>> > should be causing the difference. It seems to be something Lucene is
>> > doing under the covers.
>> >
>> > Again, if there's any other information I can provide to help
>> > determine what's going on, please let me know.
>> >
>> > Thanks,
>> > Matt
>> >
>> >
>> >
>> > ---------------------------------------------------------------------
>> > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>> > For additional commands, e-mail: java-user-h...@lucene.apache.org
>> >
>>
>>
>>
>
