[
https://issues.apache.org/jira/browse/LUCENE-4314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13437854#comment-13437854
]
Robert Muir commented on LUCENE-4314:
-------------------------------------
I think I understand it well enough. What I'm saying is that when consumers
"abuse" the API in any way (calling methods on unpositioned iterators, asking
to advance backwards, any of this), I think we should try to leave the behavior
undefined. Otherwise it makes things much harder for these iterators.
{quote}
To reiterate: it is not a matter of fixing "the single broken Scorer" - it's
about a poor API specification that forces scorers to be needlessly more
complicated than they should be, under penalty of breaking for some correct
implementations of the iterator API.
{quote}
But it really is. If I add checks for this situation, basically it's only
the (sloppy) PhraseScorer that does this crazy stuff (I think some spans impl
might too). It's not done by the exact phrase scorer, the conjunction scorer,
or any of those. In fact, this sloppy phrase scorer does things like call
advance(-1), which in my eyes is totally bogus... this is what we should fix.
{noformat}
Index: lucene/test-framework/src/java/org/apache/lucene/index/AssertingAtomicReader.java
===================================================================
--- lucene/test-framework/src/java/org/apache/lucene/index/AssertingAtomicReader.java (revision 1375005)
+++ lucene/test-framework/src/java/org/apache/lucene/index/AssertingAtomicReader.java (working copy)
@@ -241,6 +241,7 @@
     @Override
     public int advance(int target) throws IOException {
       assert state != DocsEnumState.FINISHED : "advance() called after NO_MORE_DOCS";
+      assert target > docID() : "consumer asked to advance backwards: " + target + " from: " + docID();
       int advanced = super.advance(target);
       assert advanced >= 0 : "invalid doc id: " + advanced;
       assert advanced >= target : "backwards advance from: " + target + " to: " + advanced;
@@ -296,6 +297,7 @@
     @Override
     public int advance(int target) throws IOException {
       assert state != DocsEnumState.FINISHED : "advance() called after NO_MORE_DOCS";
+      assert target > docID() : "consumer asked to advance backwards: " + target + " from: " + docID();
       int advanced = super.advance(target);
       assert advanced >= 0 : "invalid doc id: " + advanced;
       assert advanced >= target : "backwards advance from: " + target + " to: " + advanced;
{noformat}
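For readers without a Lucene checkout, here is a rough standalone sketch of the same idea: a wrapper that rejects consumer misuse rather than giving it a defined result. Everything in it is an assumption made for illustration (the DocIterator interface, ArrayDocIterator, CheckingDocIterator and the AssertingIteratorSketch class are hypothetical names, not Lucene classes), and it throws IllegalStateException instead of using assert so that the check fires even without -ea.

```java
final class AssertingIteratorSketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Hypothetical stand-in for the iterator API; this is not Lucene's class. */
  interface DocIterator {
    int docID();
    int nextDoc();
    int advance(int target);
  }

  /** Iterates over a sorted array of doc ids; docID() is -1 until positioned. */
  static final class ArrayDocIterator implements DocIterator {
    private final int[] docs;
    private int idx = -1;

    ArrayDocIterator(int... docs) {
      this.docs = docs;
    }

    @Override public int docID() {
      if (idx < 0) return -1;
      return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    @Override public int nextDoc() {
      if (idx < docs.length) idx++;
      return docID();
    }

    @Override public int advance(int target) {
      // Same shape as the reference loop quoted in the issue description.
      int doc;
      while ((doc = nextDoc()) < target) {}
      return doc;
    }
  }

  /** Wrapper that fails loudly on misuse, mirroring the checks in the patch. */
  static final class CheckingDocIterator implements DocIterator {
    private final DocIterator in;

    CheckingDocIterator(DocIterator in) {
      this.in = in;
    }

    @Override public int docID() {
      return in.docID();
    }

    @Override public int nextDoc() {
      if (in.docID() == NO_MORE_DOCS) {
        throw new IllegalStateException("nextDoc() called after NO_MORE_DOCS");
      }
      return in.nextDoc();
    }

    @Override public int advance(int target) {
      int current = in.docID();
      if (current == NO_MORE_DOCS) {
        throw new IllegalStateException("advance() called after NO_MORE_DOCS");
      }
      if (target <= current) {
        throw new IllegalStateException(
            "consumer asked to advance backwards: " + target + " from: " + current);
      }
      int advanced = in.advance(target);
      if (advanced < target) {
        throw new IllegalStateException(
            "backwards advance from: " + target + " to: " + advanced);
      }
      return advanced;
    }
  }

  public static void main(String[] args) {
    DocIterator it = new CheckingDocIterator(new ArrayDocIterator(2, 5, 9));
    System.out.println(it.advance(3)); // lands on doc 5
    try {
      it.advance(-1); // the kind of call the sloppy phrase scorer makes
    } catch (IllegalStateException e) {
      System.out.println(e.getMessage());
    }
  }
}
```

The point of putting the check in a test-only wrapper is that production iterators stay simple: they never need to handle target <= docID(), and any consumer that relies on such a call shows up as a test failure.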
> The specification of DocIdSetIterator is needlessly ambiguous.
> --------------------------------------------------------------
>
> Key: LUCENE-4314
> URL: https://issues.apache.org/jira/browse/LUCENE-4314
> Project: Lucene - Core
> Issue Type: Improvement
> Components: core/search
> Affects Versions: 3.6.1, 4.0-BETA
> Environment: All
> Reporter: Franco Callari
> Labels: index,, iterators
>
> Quoth Lucene at org.apache.lucene.search.DocIdSetIterator.advance:
> "Advances to the first beyond (see NOTE below) the current whose document
> number is greater than or equal to <i>target</i>. [...]
> <b>NOTE:</b> when <code> target ≤ current</code> implementations may opt
> not to advance beyond their current {@link #docID()}."
> However, the same specification contradictorily states that advance must
> behave as if written:
>   int advance(int target) {
>     int doc;
>     while ((doc = nextDoc()) < target) {}
>     return doc;
>   }
> That is, with at least one call to nextDoc() always made, unconditionally.
> This ambiguity can lead to unexpected behavior. In fact, arguably every user
> of this interface that does not test after every call whether the iterator
> is exhausted AND has advanced is incorrect.
> For example, I myself had one experimental implementation (coded against a
> previous Lucene release) that caused an infinite loop in PhraseScorer.java
> because, following the above specification, it "opted" not to move the
> iterator when advance(target) was called with target < current.
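To make the two readings of the specification concrete, here is a small standalone sketch. It is not Lucene code: ArrayIterator, its lazy flag, countDocs and the AdvanceAmbiguitySketch class are hypothetical names chosen for illustration, and the consumer loop is an invented example of a caller that expects advance(current) to make progress, not a copy of PhraseScorer. A step limit stands in for the infinite loop.

```java
final class AdvanceAmbiguitySketch {
  static final int NO_MORE_DOCS = Integer.MAX_VALUE;

  /** Array-backed iterator; the lazy flag selects which reading of the spec it follows. */
  static final class ArrayIterator {
    private final int[] docs;
    private final boolean lazy;
    private int idx = -1;

    ArrayIterator(boolean lazy, int... docs) {
      this.lazy = lazy;
      this.docs = docs;
    }

    int docID() {
      if (idx < 0) return -1;
      return idx < docs.length ? docs[idx] : NO_MORE_DOCS;
    }

    int nextDoc() {
      if (idx < docs.length) idx++;
      return docID();
    }

    int advance(int target) {
      // Reading 1 (the NOTE): when target <= current, staying put is allowed.
      if (lazy && idx >= 0 && target <= docID()) {
        return docID();
      }
      // Reading 2 (the reference loop): nextDoc() is always called at least once.
      int doc;
      while ((doc = nextDoc()) < target) {}
      return doc;
    }
  }

  /**
   * Hypothetical consumer that relies on advance(current) moving forward.
   * Returns the number of docs visited, or -1 if it exceeds maxSteps (stuck).
   */
  static int countDocs(ArrayIterator it, int maxSteps) {
    int steps = 0;
    int doc = it.nextDoc();
    while (doc != NO_MORE_DOCS) {
      if (++steps > maxSteps) return -1;
      doc = it.advance(doc);
    }
    return steps;
  }

  public static void main(String[] args) {
    // Follows the reference loop: the consumer terminates after 3 docs.
    System.out.println(countDocs(new ArrayIterator(false, 2, 5, 9), 100));
    // Opts not to advance: the consumer never moves past doc 2 and hits the limit.
    System.out.println(countDocs(new ArrayIterator(true, 2, 5, 9), 100));
  }
}
```

Both iterators are defensible under the quoted Javadoc, yet the same consumer terminates with one and spins with the other, which is the ambiguity this issue is about.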