Toke, thanks! We will look into this solution; it looks like exactly what we need.
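
To check my understanding of the cursor idea in the quoted reply, here is a minimal sketch in plain Java. It is not Lucene code: Doc, Cursor, pageAfter and countTo are hypothetical names, and the "index" is just an in-memory list already sorted by (archiveDate, docId). Each page starts right after the sort key of the last hit seen instead of skipping N results from the top, which is the part that makes deep pages expensive. As far as I know, IndexSearcher.searchAfter is the Lucene call that plays this role, but the sketch below does not use it.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class CursorPagingSketch {

    /** Hypothetical stand-in for an indexed document; not a Lucene type. */
    static final class Doc {
        final long archiveDate;
        final int docId;
        final String to;
        Doc(long archiveDate, int docId, String to) {
            this.archiveDate = archiveDate;
            this.docId = docId;
            this.to = to;
        }
    }

    /** Cursor: the sort key (archiveDate, docId) of the last hit already returned. */
    static final class Cursor {
        final long archiveDate;
        final int docId;
        Cursor(long archiveDate, int docId) {
            this.archiveDate = archiveDate;
            this.docId = docId;
        }
    }

    /** Orders a document against a cursor by archiveDate, then docId as tie-breaker. */
    static int compare(Doc d, Cursor c) {
        int byDate = Long.compare(d.archiveDate, c.archiveDate);
        return byDate != 0 ? byDate : Integer.compare(d.docId, c.docId);
    }

    /**
     * Returns up to pageSize docs that sort strictly after the cursor.
     * A null cursor means "start from the beginning".
     * Assumes 'sorted' is ordered by (archiveDate, docId).
     */
    static List<Doc> pageAfter(List<Doc> sorted, Cursor after, int pageSize) {
        int start = 0;
        if (after != null) {
            // Binary search for the first doc strictly greater than the cursor.
            int lo = 0, hi = sorted.size();
            while (lo < hi) {
                int mid = (lo + hi) >>> 1;
                if (compare(sorted.get(mid), after) <= 0) {
                    lo = mid + 1;
                } else {
                    hi = mid;
                }
            }
            start = lo;
        }
        int end = Math.min(sorted.size(), start + pageSize);
        return new ArrayList<>(sorted.subList(start, end));
    }

    /** Walks all pages with a cursor and counts documents per "to" address. */
    static Map<String, Integer> countTo(List<Doc> sorted, int pageSize) {
        Map<String, Integer> counts = new TreeMap<>();
        Cursor cursor = null;
        while (true) {
            List<Doc> page = pageAfter(sorted, cursor, pageSize);
            if (page.isEmpty()) {
                break;
            }
            for (Doc d : page) {
                counts.merge(d.to, 1, Integer::sum);
            }
            // Remember only the last hit's sort key; no offset is kept.
            Doc last = page.get(page.size() - 1);
            cursor = new Cursor(last.archiveDate, last.docId);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Doc> docs = List.of(
            new Doc(1L, 0, "a@example.com"),
            new Doc(1L, 1, "b@example.com"),
            new Doc(2L, 2, "a@example.com"),
            new Doc(3L, 3, "c@example.com"),
            new Doc(3L, 4, "a@example.com"));
        System.out.println(countTo(docs, 2));
    }
}
```

The docId tie-breaker matters because many documents can share one archive date; without it the cursor could skip or repeat hits at a page boundary.
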

> On 12 Nov 2015, at 20:42, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
>
> Valentin Popov <valentin...@gmail.com> wrote:
>
>> We have ~10 indexes for 500M documents. Each document has an
>> «archive date» and a «to» address, and one of our tasks is to
>> calculate statistics of «to» for the last year. Right now we are
>> using the search archive_date:(current_date - 1 year) and paginating
>> the results at 50k records per page. The bottleneck of that approach
>> is pagination: even on a powerful server it takes ~20 days to
>> execute, which is far too long.
>
> Lucene does not like deep page requests due to the way the internal Priority
> Queue works. Solr has CursorMark, which should be fairly simple to emulate in
> your Lucene handling code:
>
> http://lucidworks.com/blog/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/
>
> - Toke Eskildsen
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
> For additional commands, e-mail: java-user-h...@lucene.apache.org

Regards,
Valentin Popov