[
https://issues.apache.org/jira/browse/LUCENE-4314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437841#comment-13437841
]
Franco Callari commented on LUCENE-4314:
----------------------------------------
Hi Robert, I think you misunderstand my point.
It is not about advancing backwards (which makes no sense whatsoever in the
context of a unidirectional iterator), it's about what the iterator should do
when it is "asked" to advance backwards.
The current status is not that the behavior is unspecified. Rather it is
specified in two contradictory ways. That is, the current specification allows
iterator implementations to either (a) stay put, or (b) move one step forward
anyway. This means that every single client of DocIdSetIterator must check for
both eventualities, i.e. must contain code such as:
int doc = iter.docID();
while (doc != -1 && doc != NO_MORE_DOCS && doc < target) {
doc = iter.avance(target);
/* do useful stuff */
}
Note the comparison doc < target in the if() clause.
One may call an API specification that forces you to go through those hoops a
good one. I think it's error-prone - so prone, in fact, that PhraseScorer was
doing picturesque, yet infinite, loops when applied to a "stay put" iterator.
To reiterate: it is not a matter of fixing "the single broken Scorer" - it's a
about a poor API specification that forces scorers to be needlessly more
complicated than they should be, under penalty of breaking for some correct
implementations of the iterator API.
Just specify one behavior and the problem is solved. Having the word "opt" in a
specification is a big red flag.
I won't comment on your point about docID() on an unpositioned iterator, since
it does seem germane to this bug report.
> The specification of DocIdSetIterator is needlessly ambiguous.
> --------------------------------------------------------------
>
> Key: LUCENE-4314
> URL: https://issues.apache.org/jira/browse/LUCENE-4314
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 3.6.1, 4.0-BETA
> Environment: All
> Reporter: Franco Callari
> Labels: index,, iterators
>
> Quoth Lucene at org.apache.lucene.search.DocIdSetIterator.advance:
> "Advances to the first beyond (see NOTE below) the current whose document
> number is greater than or equal to <i>target</i>. [...]
> NOTE:</b> when <code> target ≤ current</code> implementations may opt
> not to advance beyond their current {@link #docID()}."
> However, the same specification contradictorily states that advance must
> behave as if written:
> int advance(int target) {
> int doc;
> while ((doc = nextDoc()) < target) {}
> return doc;
> }
> That is, with at least one call to nextDoc() always made, unconditionally.
> This ambiguity can lead to unexpected behavior. In fact, arguably every user
> of this interface that does not test after every call whether the iterator
> has exhausted AND has advanced is incorrect.
> For example, I myself had one experimental implementation (coded against a
> previous Lucene release) that caused an infinite loop in PhraseScorer.java
> because, following the above specification, it "opted" not to move the
> iterator when advance(target) was called with target < current.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators:
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]