[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders

Michael McCandless (JIRA) Fri, 12 Dec 2008 05:44:13 -0800

    [ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656014#action_12656014 ]


Michael McCandless commented on LUCENE-1483:
--------------------------------------------


Yeah, ugh.  This is the nature of "progress"!  It's not exactly a
straight line from point A to B :) Lots of fits & starts, dead ends,
jumps, etc.

We could simply offer both ("collect into single pqueue but pay high
warming cost" or "collect into separate pqueues, then merge, and pay
low warming cost"), but that sure is an annoying choice to have to
make.

Oh, here's another idea: do separate pqueues (again!), but after the
first segment is done, grab the values for the worst scoring doc in
the pqueue (assuming the queue filled up to its numHits) and use this
as the "cutoff" before inserting into the next segment's pqueue.

To carry that cutoff across we'd have to 1) map ord->value in segment
1, then 2) map value->ord in segment 2, then 3) use that ord as the
cutoff for segment 2.  (And likewise for each segment N -> N+1.)

I think this'd greatly reduce the number of inserts & comparisons done
in subsequent queues because it mimics how a single pqueue behaves:
you don't bother re-considering hits that won't be globally
competitive.
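
A minimal sketch of the cutoff idea above, to make the ord->value->ord
step concrete.  This is not the patch: Segment, topValues and the
inserts counter are hypothetical names, each segment is assumed to
expose a sorted ord->value array plus one ord per doc (roughly what a
string FieldCache entry gives you), and the sort is ascending by value.
When the cutoff value does not occur in the next segment, the binary
search's insertion point is used to pick the ord just below it.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class CutoffSketch {

    /** Hypothetical stand-in for one segment's sort data. */
    static final class Segment {
        final String[] lookup; // ord -> value, sorted ascending, unique
        final int[] ords;      // doc -> ord
        Segment(String[] lookup, int[] ords) { this.lookup = lookup; this.ords = ords; }
    }

    /** Number of docs offered to a per-segment queue during the last call. */
    static int inserts;

    static List<String> topValues(List<Segment> segments, int numHits, boolean useCutoff) {
        inserts = 0;
        List<String> candidates = new ArrayList<>();
        if (numHits <= 0) return candidates;
        String cutoff = null; // worst value of the most recent full queue
        for (Segment seg : segments) {
            int cutoffOrd = Integer.MAX_VALUE; // no cutoff yet: everything competes
            if (useCutoff && cutoff != null) {
                // value -> ord for this segment; a miss returns -(insertionPoint) - 1
                int pos = Arrays.binarySearch(seg.lookup, cutoff);
                cutoffOrd = pos >= 0 ? pos : -(pos + 1) - 1;
            }
            // Max-heap on ord, so the worst competitive hit sits on top.
            PriorityQueue<Integer> pq = new PriorityQueue<>(Collections.reverseOrder());
            for (int ord : seg.ords) {
                if (ord > cutoffOrd) continue; // cannot be globally competitive
                inserts++;
                if (pq.size() < numHits) {
                    pq.add(ord);
                } else if (ord < pq.peek()) {
                    pq.poll();
                    pq.add(ord);
                }
            }
            // ord -> value: only a full queue gives a usable (and tighter) cutoff.
            if (pq.size() == numHits) cutoff = seg.lookup[pq.peek()];
            for (int ord : pq) candidates.add(seg.lookup[ord]);
        }
        // Final merge of the per-segment queues.
        Collections.sort(candidates);
        return new ArrayList<>(candidates.subList(0, Math.min(numHits, candidates.size())));
    }

    public static void main(String[] args) {
        List<Segment> segs = Arrays.asList(
            new Segment(new String[] {"apple", "cherry", "fig", "kiwi"}, new int[] {3, 0, 2, 1}),
            new Segment(new String[] {"banana", "date", "grape", "lemon"}, new int[] {1, 3, 0, 2}));
        System.out.println(topValues(segs, 2, true) + " inserts=" + inserts);
        System.out.println(topValues(segs, 2, false) + " inserts=" + inserts);
    }
}
```

Ties with the cutoff value are still let into the queue here and are
sorted out by the final merge; a real collector would also need to
break ties by doc id.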

We could also maybe merge after each segment is processed; that way
the cutoff we carry to the next segment is "true" so we'd reduce
comparisons even further.
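
A rough sketch of the merge-after-each-segment variant, under the same
assumptions as before: hypothetical names throughout
(MergeEachSegmentSketch, topValues, inserts), each segment given as a
sorted ord->value array plus one ord per doc, ascending sort by value.
The only difference is that the cutoff comes from a running global
queue of values rather than from the previous segment's queue.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.PriorityQueue;

class MergeEachSegmentSketch {

    /** Number of docs offered to a per-segment queue during the last call. */
    static int inserts;

    /** lookups[s] is segment s's sorted ord -> value array; docOrds[s] holds one ord per doc. */
    static List<String> topValues(String[][] lookups, int[][] docOrds, int numHits) {
        inserts = 0;
        if (numHits <= 0) return new ArrayList<>();
        // Global top-N by value, worst (largest) value on top.
        PriorityQueue<String> global = new PriorityQueue<>(Collections.reverseOrder());
        for (int s = 0; s < lookups.length; s++) {
            String[] lookup = lookups[s];
            int cutoffOrd = Integer.MAX_VALUE;
            if (global.size() == numHits) {
                // The merged queue's worst value is the "true" cutoff so far.
                int pos = Arrays.binarySearch(lookup, global.peek());
                cutoffOrd = pos >= 0 ? pos : -(pos + 1) - 1;
            }
            // Per-segment queue still compares cheap int ords only.
            PriorityQueue<Integer> pq = new PriorityQueue<>(Collections.reverseOrder());
            for (int ord : docOrds[s]) {
                if (ord > cutoffOrd) continue; // not globally competitive
                inserts++;
                if (pq.size() < numHits) {
                    pq.add(ord);
                } else if (ord < pq.peek()) {
                    pq.poll();
                    pq.add(ord);
                }
            }
            // Merge this segment's survivors into the global queue by value.
            for (int ord : pq) {
                String value = lookup[ord];
                if (global.size() < numHits) {
                    global.add(value);
                } else if (value.compareTo(global.peek()) < 0) {
                    global.poll();
                    global.add(value);
                }
            }
        }
        List<String> result = new ArrayList<>(global);
        Collections.sort(result);
        return result;
    }

    public static void main(String[] args) {
        String[][] lookups = {
            {"apple", "cherry", "fig", "kiwi"},
            {"banana", "date", "grape", "lemon"},
            {"avocado", "blueberry", "cantaloupe"}};
        int[][] docOrds = {{3, 0, 2, 1}, {1, 3, 0, 2}, {2, 1, 0}};
        System.out.println(topValues(lookups, docOrds, 2) + " inserts=" + inserts);
    }
}
```

In this toy data the second segment never fills its own queue, so
without the merge the third segment would still be filtered against
the first segment's cutoff; with the merge it is filtered against the
tighter global one.  The price is a value comparison per merged hit.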

Would this work?  Let's try to think hard before writing code :)


> Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-1483
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1483
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch
>
>
> FieldCache and Filters are forced down to a single segment reader, allowing 
> for individual segment reloading on reopen.

