[jira] Updated: (SOLR-572) Spell Checker as a Search Component

Shalin Shekhar Mangar (JIRA) Thu, 22 May 2008 14:30:20 -0700

     [ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]


Shalin Shekhar Mangar updated SOLR-572:
---------------------------------------

    Attachment: SOLR-572.patch

This patch contains the following changes:

# Fixes bug reported by Oleg -- Thanks to Bojan for this.
# thresholdTokenFrequency can be used to restrict, by frequency, which tokens 
are passed to the spell check index. This applies only to index type dictionaries.
# Moved getLines as an overloaded method to SolrResourceLoader.
# To avoid a dependency on Lucene 2.4 (trunk) code, I created a wrapper 
class for PlainTextDictionary which calls its protected constructor 
PlainTextDictionary(Reader)
# Uses Lucene's SpellChecker's overloaded suggestSimilar method which accepts 
the IndexReader as a param. This makes sure that when the query is present in 
the index, a different suggestion is not returned.
# Implements the onlyMorePopular *only* for dictionaries built from Solr fields
# Implements the extendedResults *only* for dictionaries built from Solr fields 
and only when spellcheck.count is greater than 1
# No need to specify spellcheck.dictionary as a request parameter if only one 
dictionary is configured.
# Accuracy is configurable through solrconfig.xml
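
To illustrate the thresholdTokenFrequency item above, here is a minimal, 
self-contained sketch. It is not code from the patch. It assumes the threshold 
is a fraction of documents that a token must appear in, which the patch may 
define differently. The class name ThresholdSketch and the method 
filterByFrequency are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ThresholdSketch {

    /**
     * Hypothetical helper (not part of Solr or Lucene): keeps only the tokens
     * whose document frequency, as a fraction of numDocs, is at least
     * threshold. Assumption: the threshold is a ratio between 0 and 1.
     */
    static List<String> filterByFrequency(Map<String, Integer> docFreq,
                                          int numDocs, double threshold) {
        List<String> kept = new ArrayList<>();
        if (numDocs <= 0) {
            return kept; // an empty index contributes no tokens
        }
        for (Map.Entry<String, Integer> e : docFreq.entrySet()) {
            double ratio = (double) e.getValue() / numDocs;
            if (ratio >= threshold) {
                kept.add(e.getKey());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Made-up document frequencies for an index of 100 documents.
        Map<String, Integer> docFreq = new LinkedHashMap<>();
        docFreq.put("solr", 80);
        docFreq.put("lucene", 40);
        docFreq.put("slor", 1); // rare token, likely a misspelling

        // With a threshold of 0.05, "slor" (1/100) is dropped.
        System.out.println(filterByFrequency(docFreq, 100, 0.05));
    }
}
```

The point of such a filter is that rare tokens, which are often misspellings 
themselves, stay out of the spell check index and so are not offered as 
suggestions.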

Still to do:
# It is possible to implement onlyMorePopular and extendedResults for 
dictionaries created from arbitrary Lucene indices too, but I haven't looked 
into that yet.
# Tests are missing
# Add command to reload dictionaries

> Spell Checker as a Search Component
> -----------------------------------
>
>                 Key: SOLR-572
>                 URL: https://issues.apache.org/jira/browse/SOLR-572
>             Project: Solr
>          Issue Type: New Feature
>          Components: spellchecker
>    Affects Versions: 1.3
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Grant Ingersoll
>             Fix For: 1.3
>
>         Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
> SOLR-572.patch, SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
> following features:
> * Allow creating a spell index on a given field and make it possible to have 
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and 
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
