[jira] [Commented] (LUCENE-4314) The specification of DocIdSetIterator is needlessly ambiguous.

Franco Callari (JIRA) Mon, 20 Aug 2012 06:09:42 -0700

    [ 
https://issues.apache.org/jira/browse/LUCENE-4314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437841#comment-13437841
 ]


Franco Callari commented on LUCENE-4314:
----------------------------------------

Hi Robert, I think you misunderstand my point. 

It is not about advancing backwards (which makes no sense whatsoever in the 
context of a unidirectional iterator), it's about what the iterator should do 
when it is "asked" to advance backwards. 

The current status is not that the behavior is unspecified. Rather it is 
specified in two contradictory ways. That is, the current specification allows 
iterator implementations to either (a) stay put, or (b) move one step forward 
anyway. This means that every single client of DocIdSetIterator must check for 
both eventualities, i.e. must contain code such as:

   int doc = iter.docID();
   while (doc != -1 && doc != NO_MORE_DOCS && doc < target) {
      doc = iter.avance(target);
      /* do useful stuff */
   }

Note the comparison doc < target in the if() clause. 

One may call an API specification that forces you to go through those hoops a 
good one. I think it's error-prone - so prone, in fact, that PhraseScorer was 
doing picturesque, yet infinite, loops when applied to a "stay put" iterator. 

To reiterate: it is not a matter of fixing "the single broken Scorer" - it's a 
about a poor API specification that forces scorers to be needlessly more 
complicated than they should be, under penalty of breaking for some correct 
implementations of the iterator API.

Just specify one behavior and the problem is solved. Having the word "opt" in a 
specification is a big red flag.

I won't comment on your point about docID() on an unpositioned iterator, since 
it does seem germane to this bug report. 
                
> The specification of DocIdSetIterator is needlessly ambiguous.
> --------------------------------------------------------------
>
>                 Key: LUCENE-4314
>                 URL: https://issues.apache.org/jira/browse/LUCENE-4314
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/search
>    Affects Versions: 3.6.1, 4.0-BETA
>         Environment: All
>            Reporter: Franco Callari
>              Labels: index,, iterators
>
> Quoth Lucene at org.apache.lucene.search.DocIdSetIterator.advance:
>      "Advances to the first beyond (see NOTE below) the current whose document
>       number is greater than or equal to <i>target</i>. [...]
>       NOTE:</b> when <code> target &le; current</code> implementations may opt
>       not to advance beyond their current {@link #docID()}."
> However, the same specification contradictorily states that advance must 
> behave as if written:
>      int advance(int target) {
>        int doc;
>        while ((doc = nextDoc()) < target) {}
>        return doc;
>      }
> That is, with at least one call to nextDoc() always made, unconditionally.
> This ambiguity can lead to unexpected behavior. In fact, arguably every user 
> of this interface that does not test after every call whether the iterator 
> has exhausted AND has advanced is incorrect.
> For example, I myself had one experimental implementation (coded against a 
> previous Lucene release) that caused an infinite loop in PhraseScorer.java 
> because, following the above specification, it "opted" not to move the 
> iterator when advance(target) was called with target < current.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Commented] (LUCENE-4314) The specification of DocIdSetIterator is needlessly ambiguous.

Reply via email to