[ https://issues.apache.org/jira/browse/LUCENE-1821?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12745985#action_12745985 ]
Mark Miller edited comment on LUCENE-1821 at 8/21/09 7:14 AM:
--------------------------------------------------------------

I certainly think IndexSearcher makes a lot more sense than Searchable or Searcher there - it somewhat handles the whole API break thing - you're clearly limiting to an IndexSearcher, so it's compatible with the current API and can be clearly explained with javadoc - I still worry about pushing the API towards things that leave MultiSearcher/Remote out in the cold on features. They are technically first class citizens. I think some expert warnings could sell me anyway though.

I still have a problem with this though:
{code}
public DocIdSet getDocIdSet(final IndexReader reader) throws IOException {
  final Weight weight = query.weight(new IndexSearcher(reader));
  return new DocIdSet() {
    public DocIdSetIterator iterator() throws IOException {
      return weight.scorer(reader, docBase?, true, false);
    }
  };
}
{code}
That scorer call should get the right IndexSearcher, and I don't see how it can without breaking back compat on this method and passing an IndexSearcher too.

* edit *

And even if we fix this here, what about outside code doing the same thing? They won't get the right IndexSearcher.

      was (Author: markrmil...@gmail.com):
    I certainly think IndexSearcher makes a lot more sense than Searchable or Searcher there - it somewhat handles the whole API break thing - you're clearly limiting to an IndexSearcher, so it's compatible with the current API and can be clearly explained with javadoc - I still worry about pushing the API towards things that leave MultiSearcher/Remote out in the cold on features. They are technically first class citizens. I think some expert warnings could sell me anyway though.
I still have a problem with this though:
{code}
public DocIdSet getDocIdSet(final IndexReader reader) throws IOException {
  final Weight weight = query.weight(new IndexSearcher(reader));
  return new DocIdSet() {
    public DocIdSetIterator iterator() throws IOException {
      return weight.scorer(reader, docBase?, true, false);
    }
  };
}
{code}
That scorer call should get the right IndexSearcher, and I don't see how it can without breaking back compat on this method and passing an IndexSearcher too.

> Weight.scorer() not passed doc offset for "sub reader"
> ------------------------------------------------------
>
>                 Key: LUCENE-1821
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1821
>             Project: Lucene - Java
>          Issue Type: Bug
>          Components: Search
>    Affects Versions: 2.9
>            Reporter: Tim Smith
>             Fix For: 2.9
>
>         Attachments: LUCENE-1821.patch
>
>
> Now that searching is done on a per segment basis, there is no way for a Scorer to know the "actual" doc id for the documents it matches (only the relative doc offset into the segment).
> If you use caches in your scorer that are based on the "entire" index (all segments), there is now no way to index into them properly from inside a Scorer, because the scorer is not passed the offset needed to calculate the "real" docid.
> Suggest having the Weight.scorer() method also take an integer for the doc offset.
> The abstract Weight class should have a constructor that takes this offset as well as a method to get the offset.
> All Weights that have "sub" weights must pass this offset down to the created "sub" weights.
> Details on workaround:
> In order to work around this, you must do the following:
> * Subclass IndexSearcher
> * Add an "int getIndexReaderBase(IndexReader)" method to your subclass
> * During Weight creation, the Weight must hold onto a reference to the passed in Searcher (cast to your subclass)
> * During Scorer creation, the Scorer must be passed the result of YourSearcher.getIndexReaderBase(reader)
> * Scorer can now
rebase any collected docids using this offset
> Example implementation of getIndexReaderBase():
> {code}
> // NOTE: a more efficient implementation is possible if you cache the result of
> // gatherSubReaders in your constructor
> public int getIndexReaderBase(IndexReader reader) {
>   if (reader == getReader()) {
>     return 0;
>   } else {
>     List readers = new ArrayList();
>     gatherSubReaders(readers);
>     Iterator iter = readers.iterator();
>     int maxDoc = 0; // running total of maxDoc() for the sub readers seen so far
>     while (iter.hasNext()) {
>       IndexReader r = (IndexReader)iter.next();
>       if (r == reader) {
>         return maxDoc;
>       }
>       maxDoc += r.maxDoc();
>     }
>   }
>   return -1; // reader not in searcher
> }
> {code}
> Notes:
> * This workaround makes it so you cannot serialize your custom Weight implementation
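
The offset arithmetic behind the quoted workaround can be sketched without Lucene on the classpath. This is only an illustration, not code from the patch: `SegmentStub` and `DocBaseTable` are hypothetical names standing in for a sub reader and for the cached result of gatherSubReaders that the NOTE mentions. It assumes the sub readers are visited in index order and that each one's doc base is the sum of maxDoc() over the readers before it, as in the quoted getIndexReaderBase().

```java
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a sub reader; only maxDoc() matters for this sketch. */
final class SegmentStub {
    private final int maxDoc;

    SegmentStub(int maxDoc) {
        this.maxDoc = maxDoc;
    }

    int maxDoc() {
        return maxDoc;
    }
}

/** Hypothetical helper: computes every segment's doc base once, up front. */
final class DocBaseTable {
    // Identity map, because the quoted workaround compares readers with ==.
    private final Map<SegmentStub, Integer> bases = new IdentityHashMap<>();

    DocBaseTable(List<SegmentStub> segmentsInOrder) {
        int base = 0;
        for (SegmentStub segment : segmentsInOrder) {
            bases.put(segment, base);
            base += segment.maxDoc(); // next segment starts after this one's docs
        }
    }

    /** Returns the segment's doc base, or -1 if the segment is not in this index. */
    int baseOf(SegmentStub segment) {
        Integer base = bases.get(segment);
        return base == null ? -1 : base;
    }

    /** Converts a segment-relative doc id into an index-wide doc id. */
    int rebase(SegmentStub segment, int segmentDocId) {
        int base = baseOf(segment);
        if (base < 0) {
            throw new IllegalArgumentException("segment not in this index");
        }
        return base + segmentDocId;
    }
}
```

With three segments of 10, 5 and 7 documents, the bases come out as 0, 10 and 15, so segment-relative doc 3 in the third segment is index-wide doc 18. A Scorer holding such a base could index into an index-wide cache with the rebased id.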