Yep, I can confirm this kind of approach works. If you truly have massive amounts of data and it is all in memory, you might also want to consider making the documents a monolithic byte/char array with a computed index (i.e. a roll-your-own array). This way you avoid a lot of GC work in high-transaction systems. Make the byte[] a static final and that memory should never get looked at by a GC.
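A minimal sketch of the packed-array idea, in case it helps. The class name PackedDocs and its methods are made up for illustration, and it assumes the documents fit in a single Java array and are stored as UTF-8. The point is that a primitive array holds no object references, so the collector has nothing to trace inside it, unlike an array of millions of small String objects.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: stores many documents in one byte[] plus an offset table.
public class PackedDocs {
    // All document bytes laid out back to back.
    private final byte[] data;
    // offsets[i] is where document i starts; offsets[n] is the total length.
    private final int[] offsets;

    public PackedDocs(String[] docs) {
        byte[][] encoded = new byte[docs.length][];
        offsets = new int[docs.length + 1];
        for (int i = 0; i < docs.length; i++) {
            encoded[i] = docs[i].getBytes(StandardCharsets.UTF_8);
            offsets[i + 1] = offsets[i] + encoded[i].length;
        }
        data = new byte[offsets[docs.length]];
        for (int i = 0; i < docs.length; i++) {
            System.arraycopy(encoded[i], 0, data, offsets[i], encoded[i].length);
        }
    }

    // Number of documents stored.
    public int size() {
        return offsets.length - 1;
    }

    // Decodes document number doc on demand; nothing is cached per document.
    public String get(int doc) {
        int start = offsets[doc];
        return new String(data, start, offsets[doc + 1] - start, StandardCharsets.UTF_8);
    }
}
```

Holding one instance in a static final field, as suggested above, just keeps the arrays reachable for the life of the process. Note that get() still allocates a short-lived String per call, so callers that want zero allocation would read the bytes in place instead.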
Winton

>Doug,
>
>Thank you very much for you detailed response. I'll give your suggestions a
>try. I'm *very* impressed so far with lucene. Performance is terrific.
>
>Kind Regards,
>
>Chris Opler
>
>Doug Cutting wrote:
>
>> > From: Chris Opler [mailto:[EMAIL PROTECTED]]
>> >
>> > Am wondering if there is any facility to sort search hits by
>> > fields in the
>> > Document.
>>
>> No, there's nothing like this built in to Lucene.
>>
>> This can be very expensive with large collections, since it requires reading
>> a Document object for every hit. Reading a Document requires a
>> random-access disk read. And when someone includes a common word in a
>> query, there can be lots of hits, far more than will ever be viewed by the
>> user.
>>
>> An exception is date sorting, which can be easily implemented using a
>> HitCollector. Documents are delivered to a hit collector in the order they
>> were added to the index, so returning the oldest or most recent hits can be
>> done without reading field values. This is discussed more in:
>>   http://www.mail-archive.com/[email protected]/msg00228.html
>> Someday this will be built into Lucene...
>>
>> To implement efficient field sorting for a large collection you could
>> construct a fast index of a field (e.g., an in-memory array) and then
>> implement a HitCollector which uses this. For example, you could construct
>> an array of floats for a "price" field. Then your hit collector could do
>> something like:
>>   class MyCollector implements HitCollector {
>>     private float maxPrice = Float.MAX_VALUE;
>>     public final void collect(int doc, float score) {
>>       float price = prices[doc];
>>       if (price <= maxPrice) {
>>         hits.add(price, doc);
>>         if (hits.size() > maxHitCount) {
>>           hits.remove(hits.get(maxPrice));
>>           maxPrice = hits.lastKey();
>>         }
>>       }
>>     }
>>   }
>>
>> Also, if your collection is small, you can probably afford to simply
>> enumerate all hit documents and sort them as you wish.
>>
>> Doug
>>
>> --
>> To unsubscribe, e-mail:
>><mailto:[EMAIL PROTECTED]>
>> For additional commands, e-mail:
>><mailto:[EMAIL PROTECTED]>
>
>--
>=======================
>http://www.openwine.org

--
Winton Davies
Lead Engineer, Overture (NSDQ: OVER)
1820 Gateway Drive, Suite 360
San Mateo, CA 94404
work: (650) 403-2259
cell: (650) 867-1598
http://www.overture.com/
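
Doug's collector in the quoted message leaves prices, hits and maxHitCount undefined, so here is one way it might be fleshed out. This is only a sketch under assumptions: the class name CheapestHitsCollector is invented, it swaps the sorted map for a bounded java.util.PriorityQueue, and it deliberately does not extend Lucene's HitCollector so that it compiles on its own. It only mirrors the collect(int doc, float score) signature from the message, and wiring it into a real search is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical stand-alone collector: keeps the maxHitCount cheapest documents.
public class CheapestHitsCollector {
    // prices[doc] is the in-memory value of the "price" field for document doc.
    private final float[] prices;
    private final int maxHitCount;
    // Max-heap on price: the head is the most expensive hit kept so far.
    private final PriorityQueue<Integer> heap;

    public CheapestHitsCollector(float[] prices, int maxHitCount) {
        if (maxHitCount < 1) {
            throw new IllegalArgumentException("maxHitCount must be at least 1");
        }
        this.prices = prices;
        this.maxHitCount = maxHitCount;
        Comparator<Integer> byPrice = (a, b) -> Float.compare(prices[a], prices[b]);
        this.heap = new PriorityQueue<>(byPrice.reversed());
    }

    // Same shape as the collect method in the quoted message; score is ignored here.
    public void collect(int doc, float score) {
        if (heap.size() < maxHitCount) {
            heap.add(doc);
        } else if (prices[doc] < prices[heap.peek()]) {
            // The new hit is cheaper than the worst one kept, so replace it.
            heap.poll();
            heap.add(doc);
        }
    }

    // Returns the kept document numbers ordered from cheapest to most expensive.
    public List<Integer> cheapestFirst() {
        List<Integer> result = new ArrayList<>(heap);
        result.sort((a, b) -> Float.compare(prices[a], prices[b]));
        return result;
    }
}
```

Each hit costs one array lookup and at most one heap update, so no Document is ever read from disk during collection, which is the saving Doug describes. Only the few documents returned by cheapestFirst() need to be loaded for display.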
