Re: [I] [MINDEXER-14] FlatSearchResponse.totalHits = 1000 when there are in fact more [maven-indexer]

via GitHub Wed, 11 Jun 2025 22:51:50 -0700


jira-importer commented on issue #444:
URL: https://github.com/apache/maven-indexer/issues/444#issuecomment-2965141036


   **[Tamas Cservenak](https://issues.apache.org/jira/secure/ViewProfile.jspa?name=cstamas)** commented
   
   Agreed with the proposal, but a bit of explanation: the (old) "flat" and 
"grouped" search results are implemented in a somewhat naive way: they store 
all the elements (hits) in memory, so on large result sets they would "eat up" 
a large amount of memory (and probably OOM).
   
   This is why IteratorSearchRequest/IteratorSearchResponse was introduced: 
it relies on Lucene being smart, fetching Lucene documents "as needed" 
(while iterating over the result) and keeping no more than a few ArtifactInfo 
instances in memory. This not only lessens memory consumption but also lessens 
IO (disk bashing) that, with non-iterator searches, happens _after_ the Lucene 
search has returned, when the sequential hit fetches and ArtifactInfo record 
construction take place.
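
   To make the contrast concrete, here is a minimal, self-contained Java 
sketch of the two strategies. It deliberately does not use the maven-indexer 
or Lucene APIs: `LazyHitsSketch`, `Hit`, `load`, `eager` and `lazy` are 
hypothetical names invented for illustration, and `load` merely stands in for 
"fetch a document and build its record".

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class LazyHitsSketch {
    /** Counts how many records were built, so the two strategies can be compared. */
    static int constructed = 0;

    /** Hypothetical stand-in for a record built from a stored document. */
    static final class Hit {
        final int docId;
        Hit(int docId) { this.docId = docId; }
    }

    /** Hypothetical stand-in for fetching a document and constructing its record. */
    static Hit load(int docId) {
        constructed++;
        return new Hit(docId);
    }

    /** "Flat" style: every hit is loaded and held in memory before returning. */
    static List<Hit> eager(int[] docIds) {
        List<Hit> all = new ArrayList<>(docIds.length);
        for (int id : docIds) {
            all.add(load(id));
        }
        return all;
    }

    /** "Iterator" style: a hit is loaded only when the caller asks for it. */
    static Iterator<Hit> lazy(final int[] docIds) {
        return new Iterator<Hit>() {
            private int next = 0;

            @Override
            public boolean hasNext() {
                return next < docIds.length;
            }

            @Override
            public Hit next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return load(docIds[next++]);
            }
        };
    }

    public static void main(String[] args) {
        int[] ids = new int[10_000];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = i;
        }

        constructed = 0;
        eager(ids);
        System.out.println("eager built " + constructed + " records");

        constructed = 0;
        Iterator<Hit> it = lazy(ids);
        for (int i = 0; i < 5 && it.hasNext(); i++) {
            it.next();
        }
        System.out.println("lazy built " + constructed + " records");
    }
}
```

   In the sketch, the eager variant builds a record for every hit up front, 
while the lazy variant builds only as many as the caller actually consumes. 
That is the property the iterator search was meant to provide.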
   
   But alas, reviewing the code shows that iterator searches probably suffer 
from the same problem as flat and grouped searches: the Lucene result is 
limited to the "top 1000", it seems.
   


