[
https://issues.apache.org/jira/browse/SOLR-3240?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13228469#comment-13228469
]
Robert Muir commented on SOLR-3240:
-----------------------------------
Yes, but I'm saying that we can also still approximate the hit count.
For example, for n=1, if you have 20,000 docs and the first matching docid is
'100', we estimate there are 200 matching docs (20,000 / 100).
You can increase n (the max # of collected docs) to increase the accuracy at the
cost of performance.
Currently n=infinity and it's always exact :)
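A rough sketch of that estimate, in plain Java with no Lucene or Solr dependency. The class name, method name, and the sorted-array stand-in for a query's matching docids are all hypothetical, and the extrapolation assumes matches are spread evenly across the docid space, which a real index may not satisfy:

```java
/** Hypothetical illustration only: not Solr or Lucene code. */
class ApproxHitCount {

    /**
     * Visits at most n matching docids (in increasing order), then
     * extrapolates the total hit count from how much of the docid
     * space was covered before stopping.
     */
    static long estimate(int[] sortedMatchingDocIds, int maxDoc, int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        int collected = 0;
        int lastDocId = -1;
        for (int docId : sortedMatchingDocIds) {
            if (collected == n) {
                break; // early termination: stop after n docs
            }
            lastDocId = docId;
            collected++;
        }
        if (collected < n) {
            // Ran out of matches before reaching n, so the count is exact.
            return collected;
        }
        // n hits were found within the first lastDocId docids; scale that
        // density up to the whole index. Guard against division by zero
        // and never report more hits than there are docs.
        long scaled = (long) collected * maxDoc / Math.max(1, lastDocId);
        return Math.min(maxDoc, scaled);
    }
}
```

With 20,000 docs, n=1, and a first matching docid of 100, this returns 200, as in the example above. Passing a very large n (e.g. Integer.MAX_VALUE) never terminates early, so the result is the exact count.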
James, can you tell me how collation.hits is used? Does collation use this
directly as a heuristic for re-ranking suggestions, or is it only metadata
supplied to the user?
The idea here is that exact numbers are probably not needed for most use cases:
users would probably rather have inexact hit counts but faster performance.
> add spellcheck 'approximate collation count' mode
> -------------------------------------------------
>
> Key: SOLR-3240
> URL: https://issues.apache.org/jira/browse/SOLR-3240
> Project: Solr
> Issue Type: Improvement
> Components: spellchecker
> Reporter: Robert Muir
>
> SpellCheck's Collation in Solr is a way to ensure spellcheck/suggestions
> will actually net results (taking into account context like filtering).
> In order to do this (from my understanding), it generates candidate queries,
> executes them, and saves the total hit count: collation.setHits(hits).
> For a large index it seems this might be doing too much work: in particular
> I'm interested in ensuring this feature can work fast enough/well for
> autosuggesters.
> So I think we should offer an 'approximate' mode that uses an early-terminating
> Collector, collect()ing only N docs (e.g. n=1), and we approximate this result
> count based on docid space.
> I'm not sure what needs to happen on the solr side (possibly support for
> custom collectors?), but I think this could help and should possibly be the
> default.