Doug Cutting wrote:
Andrzej Bialecki wrote:
Ok, I just tested IndexSorter for now. It appears to work correctly,
at least I get exactly the same results, with the same scores and the
same explanations, if I run the same queries on the original and on
the sorted index.
Here's a more complete version, still mostly untested. This should
make searches faster. We'll see how good the results are...
This includes a patch to Lucene to make it easier to write hit
collectors that collect TopDocs.
I'll test this on a 38M document index tomorrow.
I'll test it soon - one comment, though. Currently you use a subclass of
RuntimeException to stop the collecting. I think we should come up with
a better mechanism - throwing exceptions is too costly. Perhaps the
HitCollector.collect() method should return a boolean to signal whether
the searcher should continue working.
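
A minimal sketch of the boolean-return idea, for illustration only. The
StoppableCollector interface, FirstNCollector class, and search() loop below
are hypothetical stand-ins, not Lucene's actual HitCollector or searcher API;
they only show how a collector could signal "stop" without throwing.

```java
import java.util.ArrayList;
import java.util.List;

public class EarlyStopSketch {

    /** Hypothetical collector interface; NOT Lucene's actual HitCollector. */
    interface StoppableCollector {
        /** @return true if the searcher should continue, false to stop. */
        boolean collect(int doc, float score);
    }

    /** Keeps the first `limit` hits, then asks the searcher to stop. */
    static class FirstNCollector implements StoppableCollector {
        private final int limit;
        final List<Integer> docs = new ArrayList<>();

        FirstNCollector(int limit) {
            this.limit = limit;
        }

        @Override
        public boolean collect(int doc, float score) {
            if (docs.size() < limit) {
                docs.add(doc);
            }
            // Continue only while there is still room for more hits.
            return docs.size() < limit;
        }
    }

    /**
     * Stand-in for the searcher's scoring loop (an assumption, not real
     * Lucene code). Returns how many documents were visited before stopping.
     */
    static int search(int[] matchingDocs, float[] scores, StoppableCollector c) {
        int visited = 0;
        for (int i = 0; i < matchingDocs.length; i++) {
            visited++;
            if (!c.collect(matchingDocs[i], scores[i])) {
                break; // collector asked to stop; no exception needed
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        FirstNCollector collector = new FirstNCollector(2);
        int visited = search(new int[] {0, 1, 2, 3},
                             new float[] {1.0f, 0.9f, 0.8f, 0.7f},
                             collector);
        System.out.println("visited=" + visited + " collected=" + collector.docs);
    }
}
```

With a sorted index, the loop can break as soon as the collector has enough
hits, and the stop signal is an ordinary return value rather than an exception
that must be constructed, thrown, and caught.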
--
Best regards,
Andrzej Bialecki <><
___. ___ ___ ___ _ _ __________________________________
[__ || __|__/|__||\/| Information Retrieval, Semantic Web
___|||__|| \| || | Embedded Unix, System Integration
http://www.sigram.com Contact: info at sigram dot com