Hi,

The big question is: Do you need the results paged at all? Do you need them sorted? If not, the easiest approach is to use a custom Collector that does no sorting and just consumes the results.
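
To illustrate the idea, here is a minimal sketch in plain Java rather than against the real Lucene API. `HitConsumer`, `ToAddressCounter`, `ToStatsDemo` and the toy arrays are all hypothetical stand-ins invented for this example; in Lucene the equivalent would be a Collector subclass (for instance one extending `SimpleCollector` in recent versions) that reads the «to» value for each collected doc ID. The point it tries to show is that every hit is visited exactly once and aggregated on the spot, with no priority queue, no sorting and no paging.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a Lucene-style collector callback: it receives
// each matching document ID once, in index order, without scoring or sorting.
interface HitConsumer {
    void collect(int docId);
}

// Counts how often each "to" address occurs among the collected hits.
class ToAddressCounter implements HitConsumer {
    // Toy stand-in for a per-document field lookup (assumption, not Lucene API).
    private final String[] toByDocId;
    private final Map<String, Long> counts = new HashMap<>();

    ToAddressCounter(String[] toByDocId) {
        this.toByDocId = toByDocId;
    }

    @Override
    public void collect(int docId) {
        // Aggregate immediately; nothing is buffered or ordered.
        counts.merge(toByDocId[docId], 1L, Long::sum);
    }

    Map<String, Long> counts() {
        return counts;
    }
}

class ToStatsDemo {
    // Toy "search": a single pass that hands every document whose archive
    // date is on or after the cutoff to the consumer.
    static void search(long[] archiveDates, long cutoff, HitConsumer consumer) {
        for (int docId = 0; docId < archiveDates.length; docId++) {
            if (archiveDates[docId] >= cutoff) {
                consumer.collect(docId);
            }
        }
    }

    // Runs the toy search over four made-up documents.
    static Map<String, Long> demo() {
        String[] to = {"a@example.com", "b@example.com", "a@example.com", "c@example.com"};
        long[] dates = {20150101L, 20141001L, 20150601L, 20151101L};
        ToAddressCounter counter = new ToAddressCounter(to);
        search(dates, 20141112L, counter);
        return counter.counts();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

With ~10 indexes, one such counter per index (or a single one reused across them) would give the per-address totals in one pass each, assuming the «to» value can be read cheaply per document.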

Uwe

-----
Uwe Schindler
H.-H.-Meier-Allee 63, D-28213 Bremen
http://www.thetaphi.de
eMail: u...@thetaphi.de

> -----Original Message-----
> From: Valentin Popov [mailto:valentin...@gmail.com]
> Sent: Thursday, November 12, 2015 6:48 PM
> To: java-user@lucene.apache.org
> Subject: Re: 500 millions document for loop.
>
> Toke, thanks!
>
> We will look at this solution; it looks like this is what we need.
>
> > On 12 Nov 2015, at 20:42, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
> >
> > Valentin Popov <valentin...@gmail.com> wrote:
> >
> >> We have ~10 indexes for 500M documents. Each document
> >> has an «archive date» and a «to» address, and one of our tasks is
> >> to calculate statistics of «to» for the last year. Right now we are
> >> using the search archive_date:(current_date - 1 year) and paginating
> >> the results at 50k records per page. The bottleneck of that approach
> >> is that pagination takes too long: on a powerful server it takes
> >> ~20 days to execute.
> >
> > Lucene does not like deep page requests due to the way the internal
> > Priority Queue works. Solr has CursorMark, which should be fairly simple
> > to emulate in your Lucene handling code:
> >
> > http://lucidworks.com/blog/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/
> >
> > - Toke Eskildsen
>
> Regards,
> Valentin Popov

---------------------------------------------------------------------
To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
For additional commands, e-mail: java-user-h...@lucene.apache.org