iprithv commented on code in PR #16146:
URL: https://github.com/apache/lucene/pull/16146#discussion_r3320544167
##########
lucene/core/src/java/org/apache/lucene/search/FilterLeafCollector.java:
##########
@@ -42,6 +46,16 @@ public void collect(int doc) throws IOException {
in.collect(doc);
}
+ @Override
Review Comment:
`FilterLeafCollector` is meant to let subclasses override `collect(int)` and
handle each doc. but with these new methods (`collect(DocIdStream)` and
`collectRange`), it just forwards everything directly to `in`. that means the
subclass’s `collect(int)` never gets called. so any existing subclass that only
overrides `collect(int)` will silently break when bulk collection is used.
I see javadoc saying subclasses should override the new methods too, but
that’s not really safe. old code won’t do that, and things will just start
giving wrong results without any error.
a safer default would be to break the batch calls into per-doc calls, like:
```java id="safex1"
@Override
public void collect(DocIdStream stream) throws IOException {
stream.forEach(this::collect);
}
@Override
public void collectRange(int min, int max) throws IOException {
for (int doc = min; doc < max; doc++) {
collect(doc);
}
}
```
also noticed the subclasses updated already call `this::collect`. but
`FilterLeafCollector` itself doesn’t, so anything not updated will behave
incorrectly.
I think this could silently break existing code.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]