[jira] Commented: (LUCENE-1483) Change IndexSearcher multisegment searches to search each individual segment using a single HitCollector

Mark Miller (JIRA) Sat, 17 Jan 2009 19:31:24 -0800

    [ 
https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12664924#action_12664924
 ]


Mark Miller commented on LUCENE-1483:
-------------------------------------

Whoops. As usual, getting ahead of myself. Perhaps there won't be big gains 
with those other queries.

While there is a big difference between searching a single segment vs 
multiple segments for these things, we already knew about that - that's why 
you optimize.

Even when we search each individual IndexSearcher with a single hit queue (this 
patch), we have to load the field cache for each segment and do a seek, the 
same as the old method with the multireader. One seek for each reader for each 
term.

However, it still appears that we get to do WAY fewer seeks this way - for my 
last example, maybe around 40,000 seeks. Quite a bit better than 1.5 million. 
But why?

Perhaps it's because we can use only the terms from each segment. Then, rather 
than (number of readers) x (total unique terms) seeks, you have only as many 
seeks as the sum of each segment's unique terms. That could account for a lot 
of saved seeks, but I am not sure that it accounts for all of them (1.5 million 
to 40,000 is quite the drop). Beyond all the saved seeks though, I imagine it's 
more efficient to hit the same reader n times than to hit each reader round 
robin n times. Something makes me think there is something else that allows us 
to avoid seeks, but I'm not sure what yet...
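
To make the arithmetic above concrete, here is a rough sketch of the two seek counts. It is only a toy model, not Lucene code: the function names and the sample term sets are made up for illustration, and it assumes exactly one seek per (reader, term) pair as described above. It does not try to reproduce the 1.5 million or 40,000 figures.

```python
def seeks_multireader(segment_terms):
    """Old-style estimate (assumed model): every reader is probed once
    for every unique term in the whole index."""
    all_terms = set().union(*segment_terms)
    return len(segment_terms) * len(all_terms)


def seeks_per_segment(segment_terms):
    """Per-segment estimate (assumed model): each reader is probed only
    for the terms that actually occur in its own segment."""
    return sum(len(terms) for terms in segment_terms)


# Hypothetical index with three segments; each set holds that
# segment's unique terms for the field being loaded.
segments = [
    {"a", "b", "c"},
    {"c", "d"},
    {"e"},
]

print(seeks_multireader(segments))  # 3 readers x 5 unique terms = 15
print(seeks_per_segment(segments))  # 3 + 2 + 1 = 6
```

The gap widens as the number of segments grows and as the segments share fewer terms, which would be consistent with a large drop, though as noted it may not explain all of it.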

> Change IndexSearcher multisegment searches to search each individual segment 
> using a single HitCollector
> --------------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1483
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1483
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: LUCENE-1483-partial.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> sortBench.py, sortCollate.py
>
>
> FieldCache and Filters are forced down to a single segment reader, allowing 
> for individual segment reloading on reopen.

