[ 
https://issues.apache.org/jira/browse/LUCENE-1614?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713048#action_12713048
 ] 

Shai Erera commented on LUCENE-1614:
------------------------------------

About ConjunctionScorer.doNext() (this also applies to 
FilteredQuery.advanceToCommon()): following Mike's proposal, I've changed it to 
this:

{code}
  private boolean doNext() throws IOException {
    int first = 0;
    lastDoc = scorers[scorers.length - 1].docID();
    Scorer firstScorer;
    while ((firstScorer = scorers[first]).docID() < lastDoc) {
      lastDoc = firstScorer.advance(lastDoc);
      first = first == scorers.length - 1 ? 0 : first + 1;
    }
    return lastDoc != NO_MORE_DOCS;
  }
{code}

This indeed gets rid of 'more': the check for 'more' in the while condition and 
the assignment to it are both gone. But now I think it may introduce a different 
inefficiency. Let's say that firstScorer.advance() returns NO_MORE_DOCS. The 
next scorer's docID is obviously smaller, so the loop runs once more and the 
first line of the 'while' body is, in effect, *lastDoc = 
firstScorer.advance(Integer.MAX_VALUE);*. There are Scorers which cannot 
implement that efficiently.

With 'more' this would not have happened, since the while condition would 
have ended the loop before that call.
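
To make the concern concrete, here is a minimal sketch that replays the doNext() loop above against two array-backed iterators and records every advance() target. RecordingIterator and DoNextTrace are hypothetical stand-ins written for this illustration, not Lucene classes, and NO_MORE_DOCS is assumed to be Integer.MAX_VALUE, as the advance(Integer.MAX_VALUE) call above implies.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a Scorer: walks a sorted array of doc IDs
// and records the target of every advance() call.
class RecordingIterator {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE; // assumed sentinel value
  private final int[] docs;
  private int idx = -1;
  final List<Integer> targets = new ArrayList<>();

  RecordingIterator(int... docs) { this.docs = docs; }

  int docID() {
    if (idx < 0) return -1;
    return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
  }

  // Moves to the next doc without recording anything.
  int nextDoc() { idx++; return docID(); }

  // Moves to the first doc >= target that lies beyond the current position.
  int advance(int target) {
    targets.add(target);
    do { idx++; } while (idx < docs.length && docs[idx] < target);
    return docID();
  }
}

class DoNextTrace {
  // Same loop as doNext() above, returning lastDoc instead of a boolean.
  static int doNext(RecordingIterator[] scorers) {
    int first = 0;
    int lastDoc = scorers[scorers.length - 1].docID();
    RecordingIterator firstScorer;
    while ((firstScorer = scorers[first]).docID() < lastDoc) {
      lastDoc = firstScorer.advance(lastDoc);
      first = first == scorers.length - 1 ? 0 : first + 1;
    }
    return lastDoc;
  }

  public static void main(String[] args) {
    RecordingIterator a = new RecordingIterator(1, 5);
    RecordingIterator b = new RecordingIterator(3, 7, 9);
    a.nextDoc();
    b.nextDoc();
    doNext(new RecordingIterator[] {a, b});
    // a is exhausted by advance(7); b is then asked to advance(NO_MORE_DOCS).
    System.out.println(b.targets); // prints [5, 2147483647]
  }
}
```

In this trace the two lists share no document, so the first scorer runs out on advance(7) and the second one still receives a final advance(Integer.MAX_VALUE), which is the extra call described above.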

Are we sure this is a worthwhile enhancement?

BTW, the code for FilteredQuery looks like this:

{code}
            while (scorerDoc != disiDoc) {
              if (scorerDoc < disiDoc) {
                if ((scorerDoc = scorer.advance(disiDoc)) == NO_MORE_DOCS) {
                  return NO_MORE_DOCS;
                }
              } else {
                if ((disiDoc = docIdSetIterator.advance(scorerDoc)) == NO_MORE_DOCS) {
                  return NO_MORE_DOCS;
                }
              }
            }
            return scorerDoc;
{code}

And I was thinking of changing it to this:

{code}
while (scorerDoc != disiDoc) {
  if (scorerDoc < disiDoc) {
    scorerDoc = scorer.advance(disiDoc);
  } else {
    disiDoc = docIdSetIterator.advance(scorerDoc);
  }
}
return scorerDoc;
{code}

What do you think?
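
For comparison, the following sketch runs both FilteredQuery loop variants over the same data and counts advance() calls. ArrayDocIterator and AdvanceToCommonDemo are hypothetical names used only for this illustration, and NO_MORE_DOCS is again assumed to be Integer.MAX_VALUE. Positioning both iterators with nextDoc() before the loop is also an assumption about the surrounding code, which is not shown here.

```java
// Hypothetical stand-in for a Scorer / DocIdSetIterator over sorted doc IDs.
class ArrayDocIterator {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE; // assumed sentinel value
  private final int[] docs;
  private int idx = -1;
  int advanceCalls = 0;

  ArrayDocIterator(int... docs) { this.docs = docs; }

  int docID() {
    if (idx < 0) return -1;
    return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
  }

  int nextDoc() { idx++; return docID(); }

  int advance(int target) {
    advanceCalls++;
    do { idx++; } while (idx < docs.length && docs[idx] < target);
    return docID();
  }
}

class AdvanceToCommonDemo {
  static final int NO_MORE_DOCS = ArrayDocIterator.NO_MORE_DOCS;

  // Current form: return as soon as either side reports NO_MORE_DOCS.
  static int withEarlyExit(ArrayDocIterator scorer, ArrayDocIterator disi) {
    int scorerDoc = scorer.nextDoc();
    int disiDoc = disi.nextDoc();
    while (scorerDoc != disiDoc) {
      if (scorerDoc < disiDoc) {
        if ((scorerDoc = scorer.advance(disiDoc)) == NO_MORE_DOCS) return NO_MORE_DOCS;
      } else {
        if ((disiDoc = disi.advance(scorerDoc)) == NO_MORE_DOCS) return NO_MORE_DOCS;
      }
    }
    return scorerDoc;
  }

  // Proposed form: no explicit checks; the loop ends once both sides agree,
  // which includes both sides sitting at NO_MORE_DOCS.
  static int simplified(ArrayDocIterator scorer, ArrayDocIterator disi) {
    int scorerDoc = scorer.nextDoc();
    int disiDoc = disi.nextDoc();
    while (scorerDoc != disiDoc) {
      if (scorerDoc < disiDoc) {
        scorerDoc = scorer.advance(disiDoc);
      } else {
        disiDoc = disi.advance(scorerDoc);
      }
    }
    return scorerDoc;
  }

  public static void main(String[] args) {
    ArrayDocIterator s = new ArrayDocIterator(1, 4, 8);
    ArrayDocIterator d = new ArrayDocIterator(2, 4, 6);
    System.out.println(simplified(s, d)); // prints 4
    System.out.println(simplified(s, d)); // prints 2147483647
    System.out.println(s.advanceCalls + d.advanceCalls); // prints 4
  }
}
```

Both variants return the same documents. On exhaustion the simplified loop issues one more advance(NO_MORE_DOCS) on the side that still has documents (four advance() calls in total here, against three with the early return), so it carries the same trade-off as the doNext() change.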

> Add next() and skipTo() variants to DocIdSetIterator that return the current 
> doc, instead of boolean
> ----------------------------------------------------------------------------------------------------
>
>                 Key: LUCENE-1614
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1614
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Search
>            Reporter: Shai Erera
>             Fix For: 2.9
>
>         Attachments: LUCENE-1614.patch, LUCENE-1614.patch, LUCENE-1614.patch
>
>
> See 
> http://www.nabble.com/Another-possible-optimization---now-in-DocIdSetIterator-p23223319.html
>  for the full discussion. The basic idea is to add variants to those two 
> methods that return the current doc they are at, to save successive calls to 
> doc(). If there are no more docs, return -1. A summary of what was discussed 
> so far:
> # Deprecate those two methods.
> # Add nextDoc() and skipToDoc(int) that return doc, with default impl in DISI 
> (calls next() and skipTo() respectively, and will be changed to abstract in 
> 3.0).
> #* I actually would like to propose an alternative to the names: advance() 
> and advance(int) - the first advances by one, the second advances to target.
> # Wherever these are used, do something like '(doc = advance()) >= 0' instead 
> of comparing to -1 for improved performance.
> I will post a patch shortly.
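
The default implementations described in the issue summary could look roughly like the sketch below. SketchDocIdSetIterator and ArraySketchIterator are hypothetical classes written for illustration (this is not the patch); only the method names next(), skipTo(), doc(), nextDoc() and skipToDoc(int) and the -1 return value are taken from the description.

```java
// Hypothetical base class mirroring the description: the old boolean methods
// plus new variants that return the current doc, or -1 when there are no more.
abstract class SketchDocIdSetIterator {
  abstract boolean next();
  abstract boolean skipTo(int target);
  abstract int doc();

  // Default implementations delegate to the old methods.
  int nextDoc() { return next() ? doc() : -1; }
  int skipToDoc(int target) { return skipTo(target) ? doc() : -1; }
}

// Minimal array-backed implementation, for demonstration only.
class ArraySketchIterator extends SketchDocIdSetIterator {
  private final int[] docs;
  private int idx = -1;

  ArraySketchIterator(int... docs) { this.docs = docs; }

  @Override boolean next() { return ++idx < docs.length; }

  @Override boolean skipTo(int target) {
    do { idx++; } while (idx < docs.length && docs[idx] < target);
    return idx < docs.length;
  }

  @Override int doc() { return docs[idx]; }
}

class NextDocDemo {
  // Caller-side pattern from the description: '(doc = nextDoc()) >= 0'.
  static int sumDocs(SketchDocIdSetIterator it) {
    int doc;
    int total = 0;
    while ((doc = it.nextDoc()) >= 0) {
      total += doc;
    }
    return total;
  }

  public static void main(String[] args) {
    System.out.println(sumDocs(new ArraySketchIterator(2, 5, 9))); // prints 16
  }
}
```

With this shape a caller needs a single call per step and no separate doc() call, which is the saving the issue is after.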

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

