Hi,

> Hi,
>
> The big question is: Do you need the results paged at all?

Yup, because if we return all results, we get an OutOfMemoryError (OOM).

> Do you need them sorted?

Nope.

> If not, the easiest approach is to use a custom Collector that does no
> sorting and just consumes the results.

The main bottleneck, as I see it, is the search for each next page, which takes ~2-4 seconds.

>
> Uwe
>
> -----
> Uwe Schindler
> H.-H.-Meier-Allee 63, D-28213 Bremen
> http://www.thetaphi.de
> eMail: u...@thetaphi.de
>
>> -----Original Message-----
>> From: Valentin Popov [mailto:valentin...@gmail.com]
>> Sent: Thursday, November 12, 2015 6:48 PM
>> To: java-user@lucene.apache.org
>> Subject: Re: 500 millions document for loop.
>>
>> Toke, thanks!
>>
>> We will look at this solution; it looks like this is what we need.
>>
>>> On 12 Nov 2015, at 20:42, Toke Eskildsen <t...@statsbiblioteket.dk> wrote:
>>>
>>> Valentin Popov <valentin...@gmail.com> wrote:
>>>
>>>> We have ~10 indexes for 500M documents; each document
>>>> has an «archive date» and a «to» address. One of our tasks is
>>>> to calculate statistics of «to» for the last year. Right now we
>>>> search archive_date:(current_date - 1 year) and paginate the
>>>> results, 50k records per page. The bottleneck of that approach is
>>>> that pagination takes too long: on a powerful server it takes
>>>> ~20 days to execute, which is far too long.
>>>
>>> Lucene does not like deep page requests due to the way the internal
>>> Priority Queue works.
>>> Solr has CursorMark, which should be fairly simple to
>>> emulate in your Lucene handling code:
>>>
>>> http://lucidworks.com/blog/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/
>>>
>>> - Toke Eskildsen
>>>
>>> ---------------------------------------------------------------------
>>> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org
>>> For additional commands, e-mail: java-user-h...@lucene.apache.org
>>>
>>
>> Regards,
>> Valentin Popov

Best regards,
Valentin Popov
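
For reference, here is a minimal plain-Java sketch of the cursor idea Toke describes. It is only a model, not Lucene code and not anyone's actual implementation. A TreeMap from document id to «to» address stands in for the matching documents, and the class and method names (CursorPagingSketch, nextPage, countRecipients) are made up for illustration. Each page starts right after the last id already consumed instead of at an offset, so a page costs roughly the same however deep into the result set it is.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;

/** Plain-Java model of cursor-based paging; all names here are hypothetical. */
final class CursorPagingSketch {

    /**
     * Returns up to pageSize entries whose id is strictly greater than the cursor.
     * A null cursor means "start from the beginning".
     */
    static List<Map.Entry<Integer, String>> nextPage(
            NavigableMap<Integer, String> toById, Integer cursor, int pageSize) {
        // Skip everything already consumed by seeking past the cursor,
        // rather than re-collecting and discarding the earlier pages.
        NavigableMap<Integer, String> rest =
                (cursor == null) ? toById : toById.tailMap(cursor, false);
        List<Map.Entry<Integer, String>> page = new ArrayList<>();
        for (Map.Entry<Integer, String> entry : rest.entrySet()) {
            if (page.size() == pageSize) {
                break;
            }
            page.add(entry);
        }
        return page;
    }

    /** Walks all pages with a cursor and counts how often each «to» address occurs. */
    static Map<String, Long> countRecipients(
            NavigableMap<Integer, String> toById, int pageSize) {
        Map<String, Long> counts = new HashMap<>();
        Integer cursor = null;
        while (true) {
            List<Map.Entry<Integer, String>> page = nextPage(toById, cursor, pageSize);
            if (page.isEmpty()) {
                break; // nothing left after the cursor
            }
            for (Map.Entry<Integer, String> entry : page) {
                counts.merge(entry.getValue(), 1L, Long::sum);
            }
            // The id of the last entry on this page becomes the next cursor.
            cursor = page.get(page.size() - 1).getKey();
        }
        return counts;
    }
}
```

Calling CursorPagingSketch.countRecipients(map, 50000) visits every entry once and holds only one page in memory at a time. In real Lucene code the cursor would presumably be the last ScoreDoc of a page passed to IndexSearcher.searchAfter. Since no ordering is needed here, the custom Collector Uwe suggests would avoid paging altogether.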