1 billion i.e. 1,000,000,000? Either buy more RAM, lots more RAM, or skip lucene sorting and do your own sorting for the top n hits. You might also want to look into sharding/distributing your index.
-- Ian. On Wed, Aug 25, 2010 at 6:16 AM, Shelly_Singh <shelly_si...@infosys.com> wrote: > I have 1 bln documents to sort. So, that would mean ( 8 bln bytes == 8GB RAM) > bytes. > All I have is 8 GB on my machine, so I do not think approach would work. > > Any other options? > > -----Original Message----- > From: Erick Erickson [mailto:erickerick...@gmail.com] > Sent: Thursday, August 19, 2010 7:18 PM > To: java-user@lucene.apache.org > Subject: Re: Sorting a Lucene index > > You haven't yet told us how many documents you're talking about here, so > it's > hard to have a good idea of what solutions are. That said, I'd just try > sorting first. > The sorting cache size will be something like (sizeof(int or long)) * > (number of documents). > Measure (remember to measure the response after query warmups, the first > few will be slower because they fill up the cache) THEN fix iff there's a > problem. > > And just forget the idea of inserting your documents in the correct order > <G>. You > stated that the documents come in in random order. Document IDs are assigned > internally to Lucene, and monotonically increasing. I sure don't see how you > can > reconcile those two things... > > But again, just try it with sorting on the numeric field and only fix things > if you have a > problem. Lots of work has been put into making Lucene fast, by very bright > people. See > if they've already solved your problem for you... > > Best > Erick. > > > On Thu, Aug 19, 2010 at 1:51 AM, Shelly_Singh <shelly_si...@infosys.com>wrote: > >> Hi Anshum, >> >> I require sorted results for all my queries and the field on which I need >> sorting is fixed; so this lead to me the idea of storing in sorted order to >> avoid sorting cost with every query. >> >> Thanks and Regards, >> >> Shelly Singh >> Center For KNowledge Driven Information Systems, Infosys >> Email: shelly_si...@infosys.com >> Phone: (M) 91 992 369 7200, (VoIP)2022978622 >> >> -----Original Message----- >> From: Anshum [mailto:ansh...@gmail.com] >> Sent: Wednesday, August 18, 2010 5:21 PM >> To: java-user@lucene.apache.org >> Subject: Re: Sorting a Lucene index >> >> Hi Shelly, >> The search results so returned are sorted either by relevance, index order, >> stored field, or custom order. >> As you are saying that you would not be able to maintain the index order, >> you would have to do the sort at run time. >> Sorting on a stored field is not costly and you may use it comfortably. >> btw, >> are you facing any issues in sort time or is it a presumption? >> >> -- >> Anshum Gupta >> http://ai-cafe.blogspot.com >> >> >> On Wed, Aug 18, 2010 at 5:12 PM, Shelly_Singh <shelly_si...@infosys.com >> >wrote: >> >> > Hi, >> > >> > I have a Lucene index that contains a numeric field along with certain >> > other fields. The order of incoming documents is random and >> un-predictable. >> > As a result, while creating an index, I end up adding docs in random >> order >> > with respect to the numeric field value. >> > >> > For example, documents may be added in following order: >> > 12,y,d >> > 100,o,p >> > 1,x,y >> > 23,u,i >> > 31,v,m >> > 22,b,m >> > 109,k,l >> > >> > My requirement is that at search time, I want the documents in order of >> the >> > numeric field. >> > One, option is to do a score/sort on the numeric field. >> > But, this may be a costly operation. >> > >> > Hence, I am trying to find if there is some way, such that, my stored >> index >> > is sorted by itself. >> > >> > Please help. >> > >> > Thanks and Regards, >> > >> > Shelly Singh >> > Center For KNowledge Driven Information Systems, Infosys >> > Email: shelly_si...@infosys.com<mailto:shelly_si...@infosys.com> >> > Phone: (M) 91 992 369 7200, (VoIP)2022978622 >> > >> > >> > >> > >> >> --------------------------------------------------------------------- >> To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org >> For additional commands, e-mail: java-user-h...@lucene.apache.org >> >> > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org