Hi, I think we can optimize DocIdSetIterator. Today it defines next() and skipTo(int), both of which return a boolean. I've checked the code, and it looks like calls to these two are almost always followed by a call to doc().
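For reference, here is a rough sketch of the call pattern described above. The classes are hypothetical stand-ins written for illustration (a toy iterator over a sorted int array), not the actual Lucene code; only the method names next(), skipTo(int) and doc() come from the discussion.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the current contract: next() and skipTo(int)
// return a boolean, and the caller asks doc() for the position afterwards.
abstract class BooleanStyleIterator {
    abstract boolean next();
    abstract boolean skipTo(int target);
    abstract int doc();
}

// Toy implementation backed by a sorted array of doc ids (an assumption
// made purely for illustration).
class ArrayBooleanStyleIterator extends BooleanStyleIterator {
    private final int[] docs;
    private int idx = -1;
    int docCalls = 0; // counts how often a caller had to ask for the current doc

    ArrayBooleanStyleIterator(int[] sortedDocs) {
        this.docs = sortedDocs;
    }

    boolean next() {
        if (idx < docs.length) idx++;
        return idx < docs.length;
    }

    boolean skipTo(int target) {
        // Advance at least once, then keep going until doc >= target.
        if (idx < docs.length) idx++;
        while (idx < docs.length && docs[idx] < target) idx++;
        return idx < docs.length;
    }

    int doc() {
        docCalls++;
        return docs[idx];
    }
}

class CurrentPatternDemo {
    // The typical caller: every successful next() is followed by doc().
    static List<Integer> collect(BooleanStyleIterator it) {
        List<Integer> out = new ArrayList<>();
        while (it.next()) {
            out.add(it.doc());
        }
        return out;
    }
}
```

In this sketch, each document visited costs two calls: one to advance and one to read the position.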
I was thinking that if those two returned the doc Id they are positioned on, instead of a boolean, that would save the call to doc(). Callers can then:
* Compare the returned doc to a NO_MORE_DOCS constant (set to -1) to learn that there are no more docs in this iterator.
* If skipTo() is called, compare 'target' to the returned Id and, if they differ, save the returned Id so that when the next skipTo is requested they can avoid performing it if the saved Id is greater than the target. If it's not possible to save it, they can call doc() to get that information.

The way I see it, the impls that still need to call doc() will lose nothing. All we do is change the 'if' from checking a boolean to comparing ints (even though that's a bit less efficient than comparing booleans). The impls that call doc() only because all they have in hand is a boolean will gain.

Obviously we can't change those methods' signatures, so we can deprecate them and introduce nextDoc() and skipToDoc(int target). We should still keep doc() around though.

What do you think? If you agree to this change, I can add it to 1593 or create a new issue (I prefer the latter so that 1593 stays focused on the changes to Collectors).

Shai
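To make the proposal above concrete, here is a minimal sketch of how nextDoc() and skipToDoc(int target) could look, and how a caller might use the returned Id. Everything except the method names and the NO_MORE_DOCS = -1 constant is an assumption for illustration: the array-backed iterator, the SavedDocCaller helper and its contains() method are hypothetical and do not exist in Lucene.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the proposed contract: the advancing methods
// return the doc Id they land on, or NO_MORE_DOCS when exhausted.
abstract class IntStyleIterator {
    static final int NO_MORE_DOCS = -1;
    abstract int nextDoc();
    abstract int skipToDoc(int target);
    abstract int doc(); // kept around, as suggested in the proposal
}

// Toy implementation over a sorted array of non-negative doc ids.
class ArrayIntStyleIterator extends IntStyleIterator {
    private final int[] docs;
    private int idx = -1;

    ArrayIntStyleIterator(int[] sortedDocs) {
        this.docs = sortedDocs;
    }

    int nextDoc() {
        if (idx < docs.length) idx++;
        return doc();
    }

    int skipToDoc(int target) {
        // Advance at least once, then keep going until doc >= target.
        if (idx < docs.length) idx++;
        while (idx < docs.length && docs[idx] < target) idx++;
        return doc();
    }

    int doc() {
        // In this sketch, an unpositioned or exhausted iterator reports NO_MORE_DOCS.
        return (idx >= 0 && idx < docs.length) ? docs[idx] : NO_MORE_DOCS;
    }
}

// Hypothetical caller that saves the returned Id so a later skip can be avoided.
class SavedDocCaller {
    private final IntStyleIterator it;
    private boolean started = false;
    private int current; // last Id returned by skipToDoc
    int skipCalls = 0;   // how many times the iterator was actually advanced

    SavedDocCaller(IntStyleIterator it) {
        this.it = it;
    }

    // Targets are assumed to be non-negative and non-decreasing across calls.
    boolean contains(int target) {
        boolean exhausted = started && current == IntStyleIterator.NO_MORE_DOCS;
        if (!started || (!exhausted && current < target)) {
            current = it.skipToDoc(target);
            started = true;
            skipCalls++;
        }
        // If the saved Id is already past the target, no skip was needed.
        return current == target;
    }
}

class ProposedPatternDemo {
    // One call per document: the loop compares the returned int to NO_MORE_DOCS.
    static List<Integer> collect(IntStyleIterator it) {
        List<Integer> out = new ArrayList<>();
        int doc;
        while ((doc = it.nextDoc()) != IntStyleIterator.NO_MORE_DOCS) {
            out.add(doc);
        }
        return out;
    }
}
```

In this sketch the plain iteration loop never calls doc(), and the saving caller only advances the iterator when the saved Id is behind the requested target.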