[GitHub] [solr] magibney commented on a diff in pull request #1236: SOLR-16585: Fix NPE in MatchAllDocs pagination

GitBox Tue, 13 Dec 2022 05:44:51 -0800


magibney commented on code in PR #1236:
URL: https://github.com/apache/solr/pull/1236#discussion_r1047173635



##########
solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java:
##########
@@ -2203,15 +2203,19 @@ public DocListAndSet getDocListAndSet(Query query, Sort 
lsort, int offset, int l
 
   private DocList constantScoreDocList(int offset, int length, DocSet docs) {
     final int size = docs.size();
-    if (length == 0 || size <= offset) {
-      return new DocSlice(0, 0, new int[0], null, size, 0f, 
TotalHits.Relation.EQUAL_TO);
-    }
-    final DocIterator iter = docs.iterator();
-    for (int i = offset; i > 0; i--) {
-      iter.nextDoc(); // discard
-    }
-    final int returnSize = Math.min(length, size - offset);
+
+    // NOTE: it would be possible to special-case `length == 0 || size <= 
offset` here
+    // (returning a DocList backed by an empty array) -- but the cases that 
would practically
+    // benefit from doing so would be extremely unusual, and likely 
pathological:
+    //   1. length==0 in conjunction with offset>0 (why?)
+    //   2. specifying offset>size (paging beyond end of results)
+    // This would require special consideration in dealing with cache handling 
(and generation
+    // of the final DocList via `DocSlice.subset(int, int)`), and it's just 
not worth it.
+
+    final int returnSize = Math.min(offset + length, size);

Review Comment:
   That's an indirect consequence of the essence of the bug. In SOLR-14765 I 
incorrectly assumed it was ok to trim the result list at the point of initial 
DocSlice creation -- i.e., if you have offset=200, len=10, that it should be 
possible to return a DocSlice backed by a 10-element int[]. But in fact there's 
an assumption, in all cases, that index 0 of the backing docs int[] is the top 
doc (and consequently we always need to populate the array out as far as 
docs[offset+length].
   
   So at this stage, DocSlice is initialized with an `offset=0` ctor arg, where 
offset reflects both the offset into the array _and_ the offset of the first 
element of the array . The _actual_ returned docList is created subsequently by 
calling initialDocList.subset(actualRequestedOffset, actualRequestedLength). In 
fact, in the codebase the initial DocSlice is almost always created with 
offset=0 -- I toyed with the idea of actually enforcing this, making a public 
ctor that always sets offset=0 and length=docs.length, and only setting 
explicit offset/len via a private ctor (invoked by DocSlice.subset(int, int)). 
But it became more complex than I'd hoped, so I figured to keep this change as 
small as possible.
   
   When initial DocSlices are created in this way, it simplifies the logic 
about how they can be cached and repurposed to serve other ranges of requested 
docs. The special-casing of length=0 I think would also have messed with this, 
hence the removal of the special-casing. Worst-case consequence of removing the 
special-casing is that if somebody requests offset>0, length=0, we now have to 
build and populate an array of `offset` elements, or if someone requests 
offset>docSet.size(), length>0, we now have to build and populate an array of 
`docSet.size()` elements. Neither of these would actually be used to satisfy 
the immediate request, but both cases are weird/pathological edge cases so 
probably not worth special-casing, given the hoops we'd have to jump through, 
e.g. to avoid caching them, etc.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [solr] magibney commented on a diff in pull request #1236: SOLR-16585: Fix NPE in MatchAllDocs pagination

Reply via email to