[
https://issues.apache.org/jira/browse/LUCENE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745735#action_12745735
]
Mark Miller commented on LUCENE-1821:
-------------------------------------
I'm still not a fan of giving access to the upper readers.
I think I could go for having the offset available with the appropriate
warnings.
I tried this out, and after adjusting all the scorer and explain methods to
carry the offset as well, I ended up with one spot left:
{code}
public DocIdSet getDocIdSet(final IndexReader reader) throws IOException {
  final Weight weight = query.weight(new IndexSearcher(reader));
  return new DocIdSet() {
    public DocIdSetIterator iterator() throws IOException {
      return weight.scorer(reader, docBase?, true, false);
    }
  };
}
{code}
Trouble: in these cases, how do you pass the doc base? It's too much breakage
to pass it with the reader *everywhere*. You almost want a class that holds the
reader ref and the docBase, but you still break APIs all over. You could
deprecate everything, but then you can't count on getting a good offset (you
would have to guess 0?).
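A minimal sketch of the "class that holds the reader ref and the docBase" idea,
only to make the shape concrete. Everything here is hypothetical and not part
of Lucene: SegmentLike stands in for the one IndexReader method the sketch
needs (maxDoc()), and ReaderAndBase, wrapAll and toTopLevelDocId are made-up
names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for IndexReader; only maxDoc() is needed here.
interface SegmentLike {
    int maxDoc();
}

// Hypothetical holder pairing a segment reader with its doc base.
final class ReaderAndBase<R extends SegmentLike> {
    final R reader;
    final int docBase;

    ReaderAndBase(R reader, int docBase) {
        this.reader = reader;
        this.docBase = docBase;
    }

    // Convert a segment-relative doc id into a top-level doc id.
    int toTopLevelDocId(int segmentDocId) {
        return docBase + segmentDocId;
    }

    // Wrap segments in order; each base is the sum of the preceding maxDoc values.
    static <R extends SegmentLike> List<ReaderAndBase<R>> wrapAll(List<R> segments) {
        List<ReaderAndBase<R>> wrapped = new ArrayList<>();
        int base = 0;
        for (R r : segments) {
            wrapped.add(new ReaderAndBase<>(r, base));
            base += r.maxDoc();
        }
        return wrapped;
    }
}
```

Passing such a holder instead of the bare reader is exactly the API break
described above: every method that takes a reader today would need a new
signature.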
> Weight.scorer() not passed doc offset for "sub reader"
> ------------------------------------------------------
>
> Key: LUCENE-1821
> URL: https://issues.apache.org/jira/browse/LUCENE-1821
> Project: Lucene - Java
> Issue Type: Bug
> Components: Search
> Affects Versions: 2.9
> Reporter: Tim Smith
> Fix For: 2.9
>
> Attachments: LUCENE-1821.patch
>
>
> Now that searching is done on a per segment basis, there is no way for a
> Scorer to know the "actual" doc id for the documents it matches (only the
> relative doc offset into the segment).
> If using caches in your scorer that are based on the "entire" index (all
> segments), there is now no way to index into them properly from inside a
> Scorer because the scorer is not passed the needed offset to calculate the
> "real" docid.
> Suggest having the Weight.scorer() method also take an integer for the doc
> offset.
> Abstract Weight class should have a constructor that takes this offset as
> well as a method to get the offset.
> All Weights that have "sub" weights must pass this offset down to created
> "sub" weights.
> Details on workaround:
> In order to work around this, you must do the following:
> * Subclass IndexSearcher
> * Add "int getIndexReaderBase(IndexReader)" method to your subclass
> * during Weight creation, the Weight must hold onto a reference to the passed
> in Searcher (cast to your subclass)
> * during Scorer creation, the Scorer must be passed the result of
> YourSearcher.getIndexReaderBase(reader)
> * Scorer can now rebase any collected docids using this offset
> Example implementation of getIndexReaderBase():
> {code}
> // NOTE: a more efficient implementation is possible if you cache the result
> // of gatherSubReaders in your constructor
> public int getIndexReaderBase(IndexReader reader) {
>   if (reader == getReader()) {
>     return 0;
>   } else {
>     List readers = new ArrayList();
>     gatherSubReaders(readers);
>     Iterator iter = readers.iterator();
>     // running sum of maxDoc() over the sub readers seen so far
>     int maxDoc = 0;
>     while (iter.hasNext()) {
>       IndexReader r = (IndexReader)iter.next();
>       if (r == reader) {
>         return maxDoc;
>       }
>       maxDoc += r.maxDoc();
>     }
>   }
>   return -1; // reader not in searcher
> }
> {code}
> Notes:
> * This workaround makes it so you cannot serialize your custom Weight
> implementation
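
The NOTE in the quoted workaround suggests caching the sub reader list rather
than rebuilding it on every call. A minimal sketch of that idea follows,
assuming the sub readers are known up front. SubReader and DocBaseCache are
hypothetical names, not Lucene classes; SubReader stands in for IndexReader,
and the identity-based map mirrors the `r == reader` comparison in the
original loop.

```java
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for IndexReader; only maxDoc() is needed here.
interface SubReader {
    int maxDoc();
}

// Hypothetical helper: computes each sub reader's doc base once, up front.
final class DocBaseCache {
    // Identity-based lookup, matching the reference comparison in the workaround.
    private final Map<SubReader, Integer> bases = new IdentityHashMap<>();

    DocBaseCache(List<? extends SubReader> subReaders) {
        int base = 0;
        for (SubReader r : subReaders) {
            // Keep the first base if the same reader instance appears twice.
            if (!bases.containsKey(r)) {
                bases.put(r, base);
            }
            base += r.maxDoc();
        }
    }

    // Returns the doc base for the reader, or -1 if it is not a known sub reader.
    int getIndexReaderBase(SubReader reader) {
        Integer base = bases.get(reader);
        return base == null ? -1 : base;
    }
}
```

Unlike the quoted method, this sketch does not special-case the top-level
reader; a real subclass would still need to return 0 for it.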