[jira] [Resolved] (LUCENE-1532) File based spellcheck with doc frequencies supplied

Erick Erickson (JIRA) Sat, 16 Mar 2013 11:40:12 -0700

     [ 
https://issues.apache.org/jira/browse/LUCENE-1532?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Erick Erickson resolved LUCENE-1532.
------------------------------------

    Resolution: Won't Fix

SPRING_CLEANING_2013 we can reopen if necessary. 
                
> File based spellcheck with doc frequencies supplied
> ---------------------------------------------------
>
>                 Key: LUCENE-1532
>                 URL: https://issues.apache.org/jira/browse/LUCENE-1532
>             Project: Lucene - Core
>          Issue Type: New Feature
>          Components: modules/spellchecker
>            Reporter: David Bowen
>            Priority: Minor
>
> The file-based spellchecker treats all words in the dictionary as equally 
> valid, so it can suggest a very obscure word rather than a more common word 
> which is equally close to the misspelled word that was entered.  It would be 
> very useful to have the option of supplying an integer with each word which 
> indicates its commonness.  I.e. the integer could be the document frequency 
> in some index or set of indexes.
> I've implemented a modification to the spellcheck API to support this by 
> defining a DocFrequencyInfo interface for obtaining the doc frequency of a 
> word, and a class which implements the interface by looking up the frequency 
> in an index.  So Lucene users can provide alternative implementations of 
> DocFrequencyInfo.  I could submit this as a patch if there is interest.  
> Alternatively, it might be better to just extend the spellcheck API to have a 
> way to supply the frequencies when you create a PlainTextDictionary, but that 
> would mean storing the frequencies somewhere when building the spellcheck 
> index, and I'm not sure how best to do that.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[jira] [Resolved] (LUCENE-1532) File based spellcheck with doc frequencies supplied

Reply via email to