[ https://issues.apache.org/jira/browse/LUCENE-1652?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12712549#action_12712549 ]
Michael McCandless commented on LUCENE-1652:
--------------------------------------------

I hate to say this (heaping yet more limitations on our back-compat "constraints"), but... it makes me nervous making a runtime-only semantic change to an API (DISI in this case), even in 3.0. Likewise, the "doc() returns -1 before next/advance have been called" would be a runtime only change.

If we did these, you could upgrade to 2.9, fix all deprecations, then upgrade to 3.0, recompile just fine, and hit weird problems since Lucene is suddenly expecting different behavior from your DISI.doc(). Such "semantics-only" changes invite subtle bugs.

I'd much prefer to find a migration path that's based on static checking, ie you get catastrophic compilation errors if you've failed to migrate.

If external code is iterating through a Lucene DISI, these semantics-only changes are harmless, since we are only defining behavior "outside" the bounds of what's currently defined. But if Lucene is interacting w/ an external DISI, then we are in trouble.

However, it's not clear to me what's the best way to make this migration "catastrophic" ... maybe we add DISI.document(), with the new semantics, and with a default impl in DISI that overlays our new semantics? (And deprecate doc()). We could do this for 2.9.

> Enhancements to Scorers following the changes to DocIdSetIterator
> -----------------------------------------------------------------
>
>                 Key: LUCENE-1652
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1652
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Shai Erera
>             Fix For: 3.0
>
>
> In LUCENE-1614, we changed the semantics of DocIdSetIterator's methods to
> return a sentinel NO_MORE_DOCS (= Integer.MAX_VALUE) when the iterator has
> exhausted. Due to backward compatibility issues, we couldn't implement that
> semantics in doc().
> Therefore this issue, which can be introduced in 3.0 only
> will:
> # Implement the new semantics in all extending classes, such that doc() will
> return NO_MORE_DOCS when the iterator has exhausted.
> # Change BooleanScorer to take advantage of that by removing sub.done from
> SubScorer and operate under the assumption that NO_MORE_DOCS is larger than
> any doc ID (Integer.MAX_VALUE).
> # Change ConjunctionScorer to operate under the same assumptions and remove
> 'more'.
> # Change ReqExclScorer to not rely on reqScorer in doc(), since the latter
> may be null.
> # Make more changes to ConjunctionScorer's init() and remove 'firstTime' to
> improve the performance of nextDoc(), score(), advance().
> # Add start()/finish() to DISI?
>
> A snippet from LUCENE-1614 regarding the change in BooleanScorer
> {code}
> int doc = sub.done ? -1 : scorer.doc();
> while (!sub.done && doc < end) {
>   sub.collector.collect(doc);
>   doc = scorer.nextDoc();
>   sub.done = doc < 0;
> }
> {code}
> To this:
> {code}
> int doc = scorer.doc();
> while (doc < end) {
>   sub.collector.collect(doc);
>   doc = scorer.nextDoc();
> }
> {code}
> And in ConjunctionScorer, change this:
> {code}
> while (more && (firstScorer=scorers[first]).doc() < (lastDoc=lastScorer.doc())) {
>   more = firstScorer.advance(lastDoc) >= 0;
>   lastScorer = firstScorer;
>   first = (first == (scorers.length-1)) ? 0 : first+1;
> }
> return more;
> {code}
> To this:
> {code}
> while ((firstScorer=scorers[first]).doc() < (lastDoc=lastScorer.doc())) {
>   firstScorer.advance(lastDoc);
>   lastScorer = firstScorer;
>   first = (first == (scorers.length-1)) ? 0 : first+1;
> }
> return lastDoc != DOC_SENTINEL;
> {code}
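
As a rough illustration of the document() idea floated in the comment above (a new accessor with a default implementation in the base class that overlays the new semantics on the old doc()), here is a minimal standalone Java sketch. It does not use Lucene's classes: DisiMigrationSketch, SketchIterator, LegacyArrayIterator and collectBelow are hypothetical names invented for this example, and the old-style boolean next() is assumed as the primitive that external iterators implement. The bookkeeping only works while subclasses leave nextDoc() alone; an iterator that overrides nextDoc() directly would bypass it, which is part of why the comment calls the best approach unclear.

```java
import java.util.ArrayList;
import java.util.List;

/** Standalone sketch; none of these types are Lucene's real classes. */
final class DisiMigrationSketch {

    /** Hypothetical stand-in for DocIdSetIterator. */
    abstract static class SketchIterator {
        static final int NO_MORE_DOCS = Integer.MAX_VALUE;

        // Tracked by the base class so document() is defined at both ends.
        private int current = -1;

        /** Old accessor: undefined before iteration starts and after exhaustion. */
        @Deprecated
        abstract int doc();

        /** Old-style step (assumed primitive): false once the iterator is exhausted. */
        abstract boolean next();

        /** New-style step: records and returns the new doc, or NO_MORE_DOCS. */
        int nextDoc() {
            current = next() ? doc() : NO_MORE_DOCS;
            return current;
        }

        /** New accessor: -1 before the first nextDoc(), NO_MORE_DOCS after exhaustion. */
        int document() {
            return current;
        }
    }

    /** An "external" iterator written against the old contract only. */
    static final class LegacyArrayIterator extends SketchIterator {
        private final int[] docs;
        private int pos = -1;

        LegacyArrayIterator(int... docs) {
            this.docs = docs;
        }

        @Override
        boolean next() {
            if (pos + 1 >= docs.length) {
                return false; // pos stays put, so doc() would return a stale value
            }
            pos++;
            return true;
        }

        @Override
        int doc() {
            return docs[pos];
        }
    }

    /** Loop in the style of the simplified BooleanScorer snippet: no 'done' flag. */
    static List<Integer> collectBelow(SketchIterator it, int end) {
        List<Integer> collected = new ArrayList<>();
        // Position the iterator on its first doc if iteration has not started yet.
        int doc = it.document() == -1 ? it.nextDoc() : it.document();
        while (doc < end) { // NO_MORE_DOCS is never < end, so this terminates
            collected.add(doc);
            doc = it.nextDoc();
        }
        return collected;
    }

    public static void main(String[] args) {
        SketchIterator it = new LegacyArrayIterator(3, 7, 12);
        System.out.println(collectBelow(it, 10));                          // [3, 7]
        System.out.println(collectBelow(it, SketchIterator.NO_MORE_DOCS)); // [12]
        System.out.println(it.document() == SketchIterator.NO_MORE_DOCS);  // true
    }
}
```

In this sketch the calling side only ever reads document(), so the legacy iterator never has to define what doc() means outside its old bounds. The overlay on its own is still a runtime mechanism; it does not produce the compile-time failure the comment asks for.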