[ https://issues.apache.org/jira/browse/SOLR-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
James Dyer updated SOLR-3240:
-----------------------------

    Attachment: SOLR-3240.patch

Ok. I think I have a version here that will never compute scores, without
having to write a lot of special code for it. Best I can tell, when
"collateMaxCollectDocs" is 0 or not specified, it will use the first
inner-class Collector in SolrIndexSearcher#getDocListNC (this one is almost
identical to TotalHitCountCollector). Otherwise, it will use
OneComparatorNonScoringCollector with the sort being on "<id>". These queries
will also make use of the Solr filter cache & query result cache when they
can, etc.

The one thing is that the unit tests make it easy to determine if it is
giving the estimate you'd expect, etc. What I can't so easily test is if I
turn off hit reporting entirely (collateExtendedResults=false), is it still
picking a non-scoring collector. I would like to add a test that does this
but not so sure what the least-invasive approach would be.

I'm also thinking I can safely get rid of the "forceInorderCollection" flag
because requesting docs sorted by doc-id would enforce the same thing, right?

> add spellcheck 'approximate collation count' mode
> -------------------------------------------------
>
>                 Key: SOLR-3240
>                 URL: https://issues.apache.org/jira/browse/SOLR-3240
>             Project: Solr
>          Issue Type: Improvement
>          Components: spellchecker
>            Reporter: Robert Muir
>         Attachments: SOLR-3240.patch, SOLR-3240.patch
>
>
> SpellCheck's Collation in Solr is a way to ensure spellcheck/suggestions
> will actually net results (taking into account context like filtering).
> In order to do this (from my understanding), it generates candidate queries,
> executes them, and saves the total hit count: collation.setHits(hits).
> For a large index it seems this might be doing too much work: in particular
> I'm interested in ensuring this feature can work fast enough/well for
> autosuggesters.
> So I think we should offer an 'approximate' mode that uses an
> early-terminating Collector, collect()ing only N docs (e.g. n=1), and we
> approximate this result count based on docid space.
> I'm not sure what needs to happen on the solr side (possibly support for
> custom collectors?), but I think this could help and should possibly be
> the default.
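
For readers following the "approximate this result count based on docid
space" idea, here is a minimal sketch of one way such an estimate could be
computed. It is not the code in the attached patch and uses no Lucene or Solr
classes: the class name ApproxHitEstimate, the method estimateHits, and the
sorted int[] standing in for a scorer that visits matches in docid order are
all hypothetical. It assumes matches are spread roughly evenly over the docid
space, and it treats a limit of 0 as "count everything", mirroring the
behaviour described above for collateMaxCollectDocs.

```java
// Hypothetical illustration only; not taken from the SOLR-3240 patch.
final class ApproxHitEstimate {

    /**
     * Estimates the total hit count after collecting at most maxCollectDocs
     * matches in docid order.
     *
     * @param matchingDocIds docids of all matches, ascending (stand-in for a scorer)
     * @param maxDoc         size of the docid space (docids are 0..maxDoc-1)
     * @param maxCollectDocs collection limit; 0 or less means no limit (exact count)
     */
    static long estimateHits(int[] matchingDocIds, int maxDoc, int maxCollectDocs) {
        if (maxCollectDocs <= 0) {
            // No limit: behave like a plain total-hit-count collector.
            return matchingDocIds.length;
        }
        int collected = 0;
        for (int doc : matchingDocIds) {
            collected++;
            if (collected >= maxCollectDocs) {
                // Terminate early. We saw `collected` hits in the first
                // (doc + 1) docids, so scale that density up to maxDoc.
                return (long) collected * maxDoc / (doc + 1);
            }
        }
        // Ran out of matches before hitting the limit: the count is exact.
        return collected;
    }

    public static void main(String[] args) {
        // 100 matches, one in every block of 10 docids, in an index of 1000 docs.
        int[] docs = new int[100];
        for (int i = 0; i < docs.length; i++) {
            docs[i] = 10 * i + 9;
        }
        System.out.println(estimateHits(docs, 1000, 5)); // estimate from 5 docs
        System.out.println(estimateHits(docs, 1000, 0)); // exact count
    }
}
```

The estimate is only as good as the even-distribution assumption: if matching
documents cluster at the low end of the docid space, stopping after a few
docs will overestimate the total, and a larger limit trades speed for
accuracy.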