[jira] Commented: (LUCENE-1483) Change IndexSearcher to use MultiSearcher semantics for multiple subreaders

Michael McCandless (JIRA) Sun, 14 Dec 2008 10:30:07 -0800

    [ https://issues.apache.org/jira/browse/LUCENE-1483?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12656431#action_12656431 ]


Michael McCandless commented on LUCENE-1483:
--------------------------------------------


{quote}
> I've started working through the tests and I've corrected a small ordering 
> problem and added most of the other basic types.
{quote}

Great!

{quote}
> Then we just have to clear up the setbase stuff.
{quote}

Yeah I think we should remove that (and use setNextReader instead).

{quote}
> That makes converting Strings to ords very difficult right? 
{quote}

Right (this was the challenging example Yonik brought up above).

How about something like this: at any given time, each slot holds an
instance that has 1) the ord (which "matches" the current reader) and
2) the actual value.  When transitioning readers you have to remap all
ords to the new reader (but keep the value unchanged): for each slot,
you take its String value and look it up in the new reader w/ binary
search.  If it's present, assign it the corresponding ord.  If it's
not present, it must fall between ord X and X+1, so assign it an ord
of X+0.5.

Then proceed like normal, since all ords are now "matched" to your
current reader.  You compare with ords, never with values.
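A rough sketch of the remapping step, to make the idea concrete.  This
is not code from the patch: the Slot class, the remap method and the
plain sorted String[] standing in for the new reader's terms are all
hypothetical, and ords here are simply zero-based array positions.

```java
import java.util.Arrays;

public class OrdRemapSketch {

    /** Hypothetical queue slot: ord relative to the current reader, plus the value. */
    static final class Slot {
        double ord;          // double so that "X + 0.5" is representable
        final String value;  // never changes across readers

        Slot(double ord, String value) {
            this.ord = ord;
            this.value = value;
        }
    }

    /**
     * Re-derives every slot's ord against the new reader.
     * Assumption: sortedTerms is the new reader's unique terms in sorted
     * order, so a term's ord is its index in the array.
     */
    static void remap(Slot[] slots, String[] sortedTerms) {
        for (Slot slot : slots) {
            int idx = Arrays.binarySearch(sortedTerms, slot.value);
            if (idx >= 0) {
                // Value exists in the new reader: take its ord directly.
                slot.ord = idx;
            } else {
                // Not present: binarySearch returns -(insertionPoint) - 1.
                // The value sorts between ord X = insertionPoint - 1 and X + 1.
                int insertionPoint = -idx - 1;
                slot.ord = (insertionPoint - 1) + 0.5;
            }
        }
    }
}
```

With this convention a value that sorts before every term in the new
reader ends up at -0.5, which still compares correctly against ord 0.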

The one caveat is... we have to take care with precision.  A float
X+0.5 won't be precise enough.  We could use double, or we could use
long and "remap" all valid ords to 2X (even), and all "in between"
ords (carried over from a previous reader) to odd values, i.e. 2X+1.
I think we'd need long for this since in the worst case, if you have
the max number of docs and each doc has a unique string, you would
have 2^31 unique ords to represent.
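The long variant could look roughly like the following.  Again this is
only a sketch: the class and method names are made up for illustration,
and it assumes the lookup result follows the java.util.Arrays.binarySearch
convention of returning -(insertionPoint) - 1 for a missing key.

```java
public class LongOrdSketch {

    /** An ord that exists in the current reader maps to the even value 2X. */
    static long exact(int ord) {
        return 2L * ord;
    }

    /** A value that falls between ord X and X + 1 maps to the odd value 2X + 1. */
    static long between(int lowerOrd) {
        return 2L * lowerOrd + 1;
    }

    /**
     * Converts a binary-search result into the doubled encoding.
     * idx >= 0 means the value was found at that ord; otherwise idx is
     * -(insertionPoint) - 1 and the value sits just below insertionPoint.
     */
    static long fromBinarySearch(int idx) {
        if (idx >= 0) {
            return exact(idx);
        }
        int insertionPoint = -idx - 1;
        return between(insertionPoint - 1);
    }

    /** Shows why float is not enough: the ".5" is lost once X reaches 2^24. */
    static boolean floatLosesHalf() {
        float f = (float) (16777216 + 0.5);
        return f == 16777216f;
    }
}
```

Doubling is also what pushes this past int: the largest int ord times
two no longer fits in 32 bits, hence long.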



> Change IndexSearcher to use MultiSearcher semantics for multiple subreaders
> ---------------------------------------------------------------------------
>
>                 Key: LUCENE-1483
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1483
>             Project: Lucene - Java
>          Issue Type: Improvement
>    Affects Versions: 2.9
>            Reporter: Mark Miller
>            Priority: Minor
>         Attachments: LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, 
> LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch, LUCENE-1483.patch
>
>
> FieldCache and Filters are forced down to a single segment reader, allowing 
> for individual segment reloading on reopen.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


---------------------------------------------------------------------
To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org
