OK, that makes sense. So Yonik's example should be read like this (I think
this is the optimal solution, since it needs no additional if-clause to
check whether the iteration has already started):

 

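import java.io.IOException;
import org.apache.lucene.search.DocIdSetIterator;

class SliceDocIdSetIterator extends DocIdSetIterator {
  // doc: value reported by docID(); stays -1 until iteration starts.
  // act: internal cursor; starts one position before the slice.
  private final int first, last;
  private int doc = -1, act;

  public SliceDocIdSetIterator(int first, int last) {
    this.first = first;
    this.act = first - 1;
    this.last = last;
  }

  public int docID() {
    return doc;
  }

  public int nextDoc() throws IOException {
    if (++act > last) act = NO_MORE_DOCS;
    return doc = act;
  }

  public int advance(int target) throws IOException {
    // Clamp to the start of the slice, so that a target below first
    // cannot return a doc outside [first, last].
    act = Math.max(target, first);
    if (act > last) act = NO_MORE_DOCS;
    return doc = act;
  }
}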

 

 

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

  _____  

From: Shai Erera [mailto:ser...@gmail.com] 
Sent: Thursday, July 16, 2009 5:04 PM
To: java-dev@lucene.apache.org; yo...@lucidimagination.com
Subject: Re: DISI semantics

 

Uwe / Yonik, DISI's class javadoc states this:

"Implementations of this class are expected to consider {@link
Integer#MAX_VALUE} as an invalid value."

Therefore "last" cannot be set to Integer.MAX_VALUE in the above example,
at least not if it wants to be a DISI.

Phew ... that was a long issue. I was able to find the conversation on -1
vs. any value before the first there:
https://issues.apache.org/jira/browse/LUCENE-1614?focusedCommentId=12714298&page=com.atlassian.jira.plugin.system.issuetabpanels%3Acomment-tabpanel#action_12714298

That link points to my response to Mike explaining why I think it would be
wrong to relax the policy of docID(). You can read 1-2 comments up and down
to get the full conversation.

In short, if we don't document clearly what docID() returns before the
iteration has started, it will be hard for code that receives a DISI to
determine whether to call nextDoc() first or to start by collecting what
docID() returns. That can be worked around, but I think the API is clear
now and does not leave room for interpretation.
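
To make the point above concrete, here is a minimal sketch of a consumer
that is handed an iterator in an unknown state. It is not Lucene code: the
Slice class and the collectRemaining() helper are hypothetical stand-ins
written for illustration, and the only thing taken from the thread is the
contract that docID() returns -1 before iteration starts and NO_MORE_DOCS
at the end.

```java
import java.util.ArrayList;
import java.util.List;

class DocIdContractSketch {
    // Same sentinel value as DocIdSetIterator.NO_MORE_DOCS.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical stand-in for a DISI over the slice [first, last]. */
    static final class Slice {
        private final int last;
        private int doc = -1, act;

        Slice(int first, int last) { this.act = first - 1; this.last = last; }

        int docID() { return doc; }

        int nextDoc() {
            // Written without ++ so repeated calls after exhaustion cannot overflow.
            if (act >= last) act = NO_MORE_DOCS; else act++;
            return doc = act;
        }
    }

    /** Collects what remains in the iterator, whether or not iteration has begun. */
    static List<Integer> collectRemaining(Slice it) {
        List<Integer> out = new ArrayList<>();
        // -1 means "not started": there is no current document yet, so step first.
        // Any other value is a real position (or NO_MORE_DOCS) and is used as is.
        int doc = it.docID() == -1 ? it.nextDoc() : it.docID();
        while (doc != NO_MORE_DOCS) {
            out.add(doc);
            doc = it.nextDoc();
        }
        return out;
    }

    public static void main(String[] args) {
        Slice fresh = new Slice(5, 8);
        System.out.println(collectRemaining(fresh));    // [5, 6, 7, 8]

        Slice started = new Slice(5, 8);
        started.nextDoc();                              // positioned on 5
        started.nextDoc();                              // positioned on 6
        System.out.println(collectRemaining(started));  // [6, 7, 8]
    }
}
```

If docID() were allowed to return any value below the first document, the
single comparison against -1 would no longer be enough to tell the two
states apart.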

Shai

On Thu, Jul 16, 2009 at 5:29 PM, Yonik Seeley <yo...@lucidimagination.com>
wrote:

On Wed, Jul 15, 2009 at 6:55 PM, Michael
McCandless<luc...@mikemccandless.com> wrote:
> I believe we debated allowing the DISI to return any docID less than
> its first real docID, not only -1, as you've done here, but I think
> Shai found something wrong with that IIRC... but I can't find this
> discussion.  Shai do you remember / can you find this past discussion
> / am I just hallucinating?

I don't know if it exists in Lucene, but I guess I can see the benefit
of only having -1 or NO_MORE_DOCS.
Consider a simplified ConjunctionScorer that didn't do anything in the
constructor but simply skipped one iterator and then did the logic of
doNext() until they all matched.  One could get a false hit with my
theoretical SliceDocIdSetIterator above.
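
The false hit described above can be sketched roughly as follows. This is
an illustration under stated assumptions, not the real ConjunctionScorer:
Iter, Slice, Sorted and firstMatch() are hypothetical names, and the
"relaxed" slice is assumed to report first-1 from docID() before iteration
starts, which is one reading of the theoretical iterator discussed here.

```java
class ConjunctionSketch {
    // Same sentinel value as DocIdSetIterator.NO_MORE_DOCS.
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Hypothetical stand-in for DocIdSetIterator. */
    interface Iter {
        int docID();
        int nextDoc();
        int advance(int target);
    }

    /** Slice [first, last]. strict: docID() is -1 before start; otherwise first-1. */
    static final class Slice implements Iter {
        private final int first, last;
        private int doc, act;

        Slice(int first, int last, boolean strict) {
            this.first = first;
            this.last = last;
            this.act = first - 1;
            this.doc = strict ? -1 : first - 1;
        }

        public int docID() { return doc; }

        public int nextDoc() {
            if (act >= last) act = NO_MORE_DOCS; else act++;
            return doc = act;
        }

        public int advance(int target) {
            act = Math.max(target, first);      // never land before the slice
            if (act > last) act = NO_MORE_DOCS;
            return doc = act;
        }
    }

    /** Iterator over an ascending list of doc ids. */
    static final class Sorted implements Iter {
        private final int[] docs;
        private int idx = -1, doc = -1;

        Sorted(int... docs) { this.docs = docs; }

        public int docID() { return doc; }

        public int nextDoc() {
            idx++;
            return doc = idx < docs.length ? docs[idx] : NO_MORE_DOCS;
        }

        public int advance(int target) {
            int d;
            while ((d = nextDoc()) < target) { /* skip docs below target */ }
            return d;
        }
    }

    /** Simplified conjunction: step only the last iterator, then leapfrog the rest. */
    static int firstMatch(Iter[] its) {
        int doc = its[its.length - 1].nextDoc();
        int i = 0;
        // An iterator is advanced only if it claims to be behind the candidate doc.
        while (its[i].docID() < doc) {
            doc = its[i].advance(doc);
            i = (i == its.length - 1) ? 0 : i + 1;
        }
        return doc;
    }

    public static void main(String[] args) {
        // The relaxed slice claims to be on doc 4 before it was ever positioned.
        System.out.println(firstMatch(new Iter[] { new Slice(5, 10, false), new Sorted(4, 7) })); // 4
        // With -1 before start, the slice is advanced and the real match is found.
        System.out.println(firstMatch(new Iter[] { new Slice(5, 10, true), new Sorted(4, 7) }));  // 7
    }
}
```

Doc 4 is not in the slice [5, 10], so the first result is a false hit; it
appears only because the unstarted slice reports a docID() that happens to
equal the candidate.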


-Yonik
http://www.lucidimagination.com

---------------------------------------------------------------------

To unsubscribe, e-mail: java-dev-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-dev-h...@lucene.apache.org

 
