Toke,

I just looked through our code, and we are already using that method (searchAfter):

    IndexSearcher indexSearcher = getIndexSearcher(searchResult);
    TopDocs topDocs;
    ScoreDoc currectScoreDoc = p.startScoreDoc;
    for (int page = 1; page < pages - 1; page++) {
        // Fetch the next page, starting after the last hit of the previous one.
        topDocs = indexSearcher.searchAfter(currectScoreDoc, query, queryFilter,
                searchResult.getPageSize(), sort);
        int endpos = topDocs.scoreDocs.length - 1;
        if (endpos > 0) {
            startIdx += topDocs.scoreDocs.length;
            // Remember the last hit of this page as the cursor for the next call.
            currectScoreDoc = topDocs.scoreDocs[endpos];
            searchResult.setPage(currectScoreDoc, startIdx);
        }
        topDocs = null; // drop the reference so the page can be garbage collected
        if (searchResult.getCancelled()) {
            return searchResult;
        }
    }
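For anyone following the thread, here is a minimal Lucene-free sketch of the cursor idea behind searchAfter, combined with the «to» counting we need. It is a simulation, not our production code: the class CursorPagingSketch, the Doc type, and the methods searchAfter and countTo below are hypothetical stand-ins, and an in-memory sorted list takes the place of the index. It assumes every document has a unique sort key.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Lucene-free simulation of cursor ("search after") paging.
// All names here are hypothetical and exist only for illustration.
class CursorPagingSketch {

    // Stand-in for an indexed document: a unique sort key plus the «to» address.
    static final class Doc {
        final long sortKey;
        final String to;

        Doc(long sortKey, String to) {
            this.sortKey = sortKey;
            this.to = to;
        }
    }

    // Returns up to pageSize docs whose sortKey is strictly greater than 'after'
    // (null means "start from the beginning"). 'sorted' must be ascending by sortKey.
    static List<Doc> searchAfter(List<Doc> sorted, Long after, int pageSize) {
        int lo = 0;
        int hi = sorted.size();
        if (after != null) {
            // Binary search for the first doc that sorts after the cursor.
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (sorted.get(mid).sortKey <= after) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
        }
        int end = Math.min(lo + pageSize, sorted.size());
        return new ArrayList<>(sorted.subList(lo, end));
    }

    // Walks all pages with a cursor and counts how often each «to» address occurs.
    static Map<String, Integer> countTo(List<Doc> sorted, int pageSize) {
        Map<String, Integer> counts = new TreeMap<>();
        Long cursor = null;
        while (true) {
            List<Doc> page = searchAfter(sorted, cursor, pageSize);
            if (page.isEmpty()) {
                break; // no more hits
            }
            for (Doc d : page) {
                counts.merge(d.to, 1, Integer::sum);
            }
            // The last hit of this page becomes the cursor for the next request.
            cursor = page.get(page.size() - 1).sortKey;
        }
        return counts;
    }
}
```

The only point of the sketch is that each request carries the last hit of the previous page instead of a page offset; how much that helps with a real index is a separate question.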

> On Nov 12, 2015, at 20:42, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
>
> Valentin Popov <valentin...@gmail.com> wrote:
>
>> We have ~10 indexes for 500M documents. Each document
>> has an «archive date» and a «to» address, and one of our tasks is
>> to calculate statistics on «to» for the last year. Right now we
>> search archive_date:(current_date - 1 year) and paginate the
>> results at 50k records per page. The bottleneck of that approach
>> is pagination: even on a powerful server it takes ~20 days to
>> execute, which is far too long.
>
> Lucene does not like deep page requests due to the way the internal Priority
> Queue works. Solr has CursorMark, which should be fairly simple to emulate in
> your Lucene handling code:
>
> http://lucidworks.com/blog/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/
>
> - Toke Eskildsen
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org

Regards,
Valentin Popov