magibney commented on code in PR #1236:
URL: https://github.com/apache/solr/pull/1236#discussion_r1047173635
##########
solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java:
##########
@@ -2203,15 +2203,19 @@ public DocListAndSet getDocListAndSet(Query query, Sort
lsort, int offset, int l
private DocList constantScoreDocList(int offset, int length, DocSet docs) {
final int size = docs.size();
- if (length == 0 || size <= offset) {
- return new DocSlice(0, 0, new int[0], null, size, 0f,
TotalHits.Relation.EQUAL_TO);
- }
- final DocIterator iter = docs.iterator();
- for (int i = offset; i > 0; i--) {
- iter.nextDoc(); // discard
- }
- final int returnSize = Math.min(length, size - offset);
+
+ // NOTE: it would be possible to special-case `length == 0 || size <=
offset` here
+ // (returning a DocList backed by an empty array) -- but the cases that
would practically
+ // benefit from doing so would be extremely unusual, and likely
pathological:
+ // 1. length==0 in conjunction with offset>0 (why?)
+ // 2. specifying offset>size (paging beyond end of results)
+ // This would require special consideration in dealing with cache handling
(and generation
+ // of the final DocList via `DocSlice.subset(int, int)`), and it's just
not worth it.
+
+ final int returnSize = Math.min(offset + length, size);
Review Comment:
That's an indirect consequence of the essence of the bug. In SOLR-14765 I
incorrectly assumed it was ok to trim the result list at the point of initial
DocSlice creation -- i.e., if you have offset=200, len=10, that it should be
possible to return a DocSlice backed by a 10-element int[]. But in fact there's
an assumption, in all cases, that index 0 of the backing docs int[] is the top
doc (and consequently we always need to populate the array out as far as
docs[offset+length].
So at this stage, DocSlice is initialized with an `offset=0` ctor arg, where
offset reflects both the offset into the array _and_ the offset of the first
element of the array . The _actual_ returned docList is created subsequently by
calling initialDocList.subset(actualRequestedOffset, actualRequestedLength). In
fact, in the codebase the initial DocSlice is almost always created with
offset=0 -- I toyed with the idea of actually enforcing this, making a public
ctor that always sets offset=0 and length=docs.length, and only setting
explicit offset/len via a private ctor (invoked by DocSlice.subset(int, int)).
But it became more complex than I'd hoped, so I figured to keep this change as
small as possible.
When initial DocSlices are created in this way, it simplifies the logic
about how they can be cached and repurposed to serve other ranges of requested
docs. The special-casing of length=0 I think would also have messed with this,
hence the removal of the special-casing. Worst-case consequence of removing the
special-casing is that if somebody requests offset>0, length=0, we now have to
build and populate an array of `offset` elements, or if someone requests
offset>docSet.size(), length>0, we now have to build and populate an array of
`docSet.size()` elements. Neither of these would actually be used to satisfy
the immediate request, but both cases are weird/pathological edge cases so
probably not worth special-casing, given the hoops we'd have to jump through,
e.g. to avoid caching them, etc.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]