Doug Cutting wrote:

Andrzej Bialecki wrote:

Ok, I just tested IndexSorter for now. It appears to work correctly, at least I get exactly the same results, with the same scores and the same explanations, if I run the smae queries on the original and on the sorted index.


Here's a more complete version, still mostly untested. This should make searches faster. We'll see how much good the results are...

This includes a patch to Lucene to make it easier to write hit collectors that collect TopDocs.

I'll test this on a 38M document index tomorrow.


I'll test it soon - one comment, though. Currently you use a subclass of RuntimeException to stop the collecting. I think we should come up with a better mechanism - throwing exceptions is too costly. Perhaps the HitCollector.collect() method should return a boolean to signal whether the searcher should continue working.

--
Best regards,
Andrzej Bialecki     <><
___. ___ ___ ___ _ _   __________________________________
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com




-------------------------------------------------------
This SF.net email is sponsored by: Splunk Inc. Do you grep through log files
for problems?  Stop!  Download the new AJAX search engine that makes
searching your log files as easy as surfing the  web.  DOWNLOAD SPLUNK!
http://ads.osdn.com/?ad_id=7637&alloc_id=16865&op=click
_______________________________________________
Nutch-developers mailing list
Nutch-developers@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/nutch-developers

Reply via email to