[ 
https://issues.apache.org/jira/browse/SOLR-2761?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13105400#comment-13105400
 ] 

David Smiley commented on SOLR-2761:
------------------------------------

LOL; I am not using it "seriously". I'm merely kicking the tires to see how 
well it works so I can write about it in the 2nd edition of my book.  When you 
say "Most people use Solr queries to do suggestions it seems", do you mean 
search query logs? That requires sufficient query volume, and it's certainly 
more work to set up than query term completion/suggest.

> FSTLookup should use long-tail like discretization instead of proportional 
> (linear)
> -----------------------------------------------------------------------------------
>
>                 Key: SOLR-2761
>                 URL: https://issues.apache.org/jira/browse/SOLR-2761
>             Project: Solr
>          Issue Type: Improvement
>          Components: spellchecker
>    Affects Versions: 3.4
>            Reporter: David Smiley
>            Assignee: Dawid Weiss
>            Priority: Minor
>
> The Suggester's FSTLookup implementation discretizes the term frequencies 
> into a configurable number of buckets (set via "weightBuckets") in order to 
> deal with FST limitations. The mapping of a source frequency into a bucket 
> is a proportional (i.e. linear) mapping between the minimum and maximum 
> values. I don't think this makes sense at all given the well-known 
> long-tail-like distribution of term frequencies. As a result of this 
> problem, I've found it necessary to increase weightBuckets substantially, 
> to more than 100, to get quality suggestions. 
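
To make the complaint in the issue description concrete, here is a minimal
Python sketch comparing a proportional mapping with a logarithmic one. This is
not the FSTLookup code: the function names, the sample frequencies, and the
choice of a logarithm as the "long-tail like" mapping are all assumptions made
for illustration only.

```python
import math


def linear_bucket(freq, min_freq, max_freq, buckets):
    """Proportional mapping of freq in [min_freq, max_freq] to 0..buckets-1."""
    if max_freq == min_freq:
        return 0
    scaled = (freq - min_freq) / (max_freq - min_freq)
    # Clamp so that freq == max_freq lands in the last bucket.
    return min(buckets - 1, int(scaled * buckets))


def log_bucket(freq, min_freq, max_freq, buckets):
    """Hypothetical alternative: map on a logarithmic scale (needs min_freq > 0)."""
    if max_freq == min_freq:
        return 0
    scaled = math.log(freq / min_freq) / math.log(max_freq / min_freq)
    return min(buckets - 1, int(scaled * buckets))


# Made-up term frequencies with a long tail of rare terms.
freqs = [1, 3, 10, 30, 200, 1000, 5000, 10000]
linear = [linear_bucket(f, 1, 10000, 10) for f in freqs]
logarithmic = [log_bucket(f, 1, 10000, 10) for f in freqs]

print("linear:     ", linear)
print("logarithmic:", logarithmic)
```

With only 10 buckets, the proportional mapping puts every frequency up to 1000
into bucket 0, so those terms cannot be ranked against each other, while the
logarithmic mapping spreads the same values over most of the buckets. That is
consistent with needing a much larger weightBuckets value to get useful
ordering under the linear scheme.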

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

