[jira] Commented: (LUCENE-1821) Weight.scorer() not passed doc offset for "sub reader"

Mark Miller (JIRA) Tue, 18 Aug 2009 18:29:39 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12744827#action_12744827
 ]


Mark Miller commented on LUCENE-1821:
-------------------------------------

bq. I think Tim's got a valid point though about wanting an ordinal value 
across the entire index ...

I don't disagree about wanting them at all. He's using them for a neat purpose.

bq. he's not using external ids, he's using the internal lucene docIds

If he were respecting the internal ids, you wouldn't need to calculate the 
multi-reader id. He's essentially caching the multi-reader ids - that's the 
same as using a filter that always allows doc 0 to pass - it's using the 
internal ids externally. To use the ids correctly, you get a reader and an id 
space that starts at 0 for that reader. If you want to use the whole reader, 
you should work with the multi-reader. You can use the multi-reader without 
breaking it apart here as well if you need to.
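
To make the two id spaces concrete, here is a minimal sketch in plain Java. It 
uses no Lucene classes, and the class and method names are hypothetical. It 
assumes each segment's ids start at 0 and that a segment's base offset is the 
sum of the document counts of the segments before it.

```java
// Hypothetical sketch: plain Java only, no Lucene classes.
public class DocIdSpaces {

    // starts[i] is the base offset of segment i; the last entry is the total doc count.
    static int[] docBases(int[] segmentMaxDocs) {
        int[] starts = new int[segmentMaxDocs.length + 1];
        for (int i = 0; i < segmentMaxDocs.length; i++) {
            starts[i + 1] = starts[i] + segmentMaxDocs[i];
        }
        return starts;
    }

    // Segment-relative id -> id in the combined (multi-reader style) space.
    static int toTopLevel(int[] starts, int segment, int segmentDoc) {
        return starts[segment] + segmentDoc;
    }

    // Combined id -> index of the segment that contains it (linear scan for clarity).
    static int segmentOf(int[] starts, int topLevelDoc) {
        int total = starts[starts.length - 1];
        if (topLevelDoc < 0 || topLevelDoc >= total) {
            throw new IllegalArgumentException("doc out of range: " + topLevelDoc);
        }
        int segment = 0;
        while (topLevelDoc >= starts[segment + 1]) {
            segment++;
        }
        return segment;
    }

    public static void main(String[] args) {
        int[] starts = docBases(new int[] {5, 3, 4}); // three segments
        int topLevel = toTopLevel(starts, 1, 2);       // doc 2 of segment 1
        int segment = segmentOf(starts, topLevel);
        // prints "7 -> segment 1, local doc 2"
        System.out.println(topLevel + " -> segment " + segment
                + ", local doc " + (topLevel - starts[segment]));
    }
}
```

A cache indexed by the combined id only stays valid while the segment layout 
stays the same, which is why the combined ids behave like external ids here.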

I think it's a slippery slope - we start having to support both the segment 
ids, plus the multi-reader ids. And as we work on real-time, we will have to 
count on users caching that way - I think it's better to try and work all of 
our support towards per segment.
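
As a rough illustration of what per-segment caching could look like, the 
sketch below keys a cache by segment and indexes it with that segment's own 
ids, so no offset into a combined id space is needed. Segment is a 
hypothetical stand-in for a per-segment reader, not a Lucene class, and the 
cache holds plain ints purely for demonstration.

```java
import java.util.IdentityHashMap;
import java.util.Map;

// Hypothetical sketch: Segment stands in for a per-segment reader; it is not a Lucene class.
public class PerSegmentCache {

    static final class Segment {
        final int maxDoc;
        Segment(int maxDoc) { this.maxDoc = maxDoc; }
    }

    // One array per segment, indexed by that segment's own ids (0 .. maxDoc - 1).
    private final Map<Segment, int[]> valuesBySegment = new IdentityHashMap<>();

    // Returns the segment's array, creating it on first use.
    int[] valuesFor(Segment segment) {
        int[] values = valuesBySegment.get(segment);
        if (values == null) {
            values = new int[segment.maxDoc]; // a real cache would load values here
            valuesBySegment.put(segment, values);
        }
        return values;
    }

    void set(Segment segment, int segmentDoc, int value) {
        valuesFor(segment)[segmentDoc] = value;
    }

    int get(Segment segment, int segmentDoc) {
        return valuesFor(segment)[segmentDoc];
    }

    public static void main(String[] args) {
        PerSegmentCache cache = new PerSegmentCache();
        Segment a = new Segment(5);
        Segment b = new Segment(3);
        cache.set(a, 0, 11); // doc 0 of segment a
        cache.set(b, 0, 22); // doc 0 of segment b, a different document
        System.out.println(cache.get(a, 0) + " " + cache.get(b, 0));
    }
}
```

When a segment goes away, only its own entry has to be dropped; entries for 
the other segments are unaffected.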

I'll leave it for smarter people to discuss for now - but I don't think it's 
the right path. He can essentially do what he needs without built-in support, 
and personally I think that's the way to go. I think it's great that right now, 
other than the sorting/hitcollector, things don't know about the sub reader 
breakout.

> Weight.scorer() not passed doc offset for "sub reader"
> ------------------------------------------------------
>
>                 Key: LUCENE-1821
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1821
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Tim Smith
>
> Now that searching is done on a per segment basis, there is no way for a 
> Scorer to know the "actual" doc id for the documents it matches (only the 
> relative doc offset into the segment).
> If using caches in your scorer that are based on the "entire" index (all 
> segments), there is now no way to index into them properly from inside a 
> Scorer because the scorer is not passed the needed offset to calculate the 
> "real" docid.
> Suggest having the Weight.scorer() method also take an integer for the doc 
> offset.
> Abstract Weight class should have a constructor that takes this offset as 
> well as a method to get the offset.
> All Weights that have "sub" weights must pass this offset down to created 
> "sub" weights.

