On Dec 14, 2009, at 6:00 PM, Niclas Rothman wrote:

> Hi there,
> Perhaps this is far out, but I need some advice on the following
> problem.
>
> We use Lucene for what it is really good at: finding documents by "relevance".
> After a search has been done and I have the hits in my hands, I need to do
> some heavy sorting on this list, where the data used for sorting is stored in the
> database, not in the Lucene index.
> Therefore I need to get all document ids for a search so I can fetch the
> needed data from the database and afterwards apply my custom sorting.
>
> How can I get all document ids from a search?
> Can this be done with OK performance?
>
> I have been wondering if I could do the sorting in Lucene, but I don't feel
> comfortable at all because of the lack of information / documentation.
> Also, the sorting should preferably be done just in time; that is, the
> underlying data for sorting changes constantly and I can't reindex as soon as
> the sorting data changes.
>
> Any ideas / suggestions?
I would look at implementing a custom comparator for the Sort instance in Lucene. This requires implementing a FieldComparatorSource and a FieldComparator. There are lots of examples in the Lucene code of this. Note, the name FieldComparatorSource is a bit of a misnomer, as it doesn't have to be a Field (for instance, on SOLR-1297, I just implemented it to allow for sorts by Function Queries).

Naturally, getting this to perform with a database is going to be pretty tricky, but I think it will be way better than having to process all of the results a second time. Having an effective caching strategy (similar to Lucene's FieldCache) will be important.

The other thing you could think about doing is loading a FieldCache with the ids (do it once when you load the IndexReader) and then use that with a bit set telling you what documents matched.

In either case, you are making a tradeoff with memory.

-Grant

--------------------------
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids) using Solr/Lucene:
http://www.lucidimagination.com/search
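To make the second suggestion (an id cache plus a bit set of matches) a little more concrete, here is a minimal sketch in plain Java with no Lucene dependency. Everything in it is an assumption for illustration: the class and method names are hypothetical, the `String[]` stands in for whatever per-document id array you load once per IndexReader (in Lucene of this era that would presumably come from the FieldCache), the `BitSet` stands in for the documents a collector marked as matching, and the `Map` stands in for sort values fetched from the database.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only; not part of Lucene.
class MatchedIdSort {

    /**
     * Walks the set bits (internal document numbers that matched) and maps
     * each one to its external id via the per-reader id cache.
     */
    static List<String> matchedIds(String[] idCache, BitSet matches) {
        List<String> ids = new ArrayList<>();
        for (int doc = matches.nextSetBit(0); doc >= 0; doc = matches.nextSetBit(doc + 1)) {
            ids.add(idCache[doc]);
        }
        return ids;
    }

    /**
     * Sorts the ids by a key looked up outside the index (ascending).
     * Ids without a key sort last. The map is a stand-in for a database lookup.
     */
    static List<String> sortByExternalKey(List<String> ids, Map<String, Integer> sortKeys) {
        List<String> sorted = new ArrayList<>(ids);
        sorted.sort(Comparator.comparing(
                (String id) -> sortKeys.getOrDefault(id, Integer.MAX_VALUE)));
        return sorted;
    }

    public static void main(String[] args) {
        // Index position = internal document number, value = external id.
        String[] idCache = {"a", "b", "c", "d"};

        // Documents 0, 2 and 3 matched the query.
        BitSet matches = new BitSet(idCache.length);
        matches.set(0);
        matches.set(2);
        matches.set(3);

        // Sort values that live in the database rather than in the index.
        Map<String, Integer> sortKeys = new HashMap<>();
        sortKeys.put("a", 30);
        sortKeys.put("c", 10);
        sortKeys.put("d", 20);

        System.out.println(sortByExternalKey(matchedIds(idCache, matches), sortKeys));
        // prints [c, d, a]
    }
}
```

The memory tradeoff mentioned above shows up directly here: the id array holds one entry per document in the index, and the bit set one bit per document, regardless of how many documents match.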
