[ 
https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12598752#action_12598752
 ] 

Bojan Smid commented on SOLR-572:
---------------------------------

I noticed that when asking for a suggestion for a word that already exists in 
the dictionary, SC returns some similar word instead of that same word. 
The old SCRH had an "exist" field that was true if the word exists in the 
dictionary (so the client could treat it as a correct word that needs no 
suggestion). 

We can't have exactly the same functionality here (since multi-word queries 
must be supported), but we can make SC return a "spellingCorrect" field when 
all words from the query exist in the dictionary. Otherwise, the client has no 
way to know whether the spelling was correct or a suggestion should be shown.
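
A rough sketch of the "spellingCorrect" idea, in plain Java so it runs without 
Solr or Lucene on the classpath. The class name, the method name and the 
in-memory dictionary are all made up for illustration; in the component the 
dictionary lookup would presumably be backed by the spell index instead of a 
Set.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Predicate;

// Hypothetical helper, not part of Solr or of the SOLR-572 patch.
class SpellingCorrectSketch {

    // Returns true only if every token is known to the dictionary.
    // An empty token list is treated as "nothing to correct".
    static boolean spellingCorrect(List<String> tokens,
                                   Predicate<String> existsInDictionary) {
        for (String token : tokens) {
            if (!existsInDictionary.test(token)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Assumption: a tiny in-memory dictionary stands in for the spell index.
        Set<String> dictionary = new HashSet<>(Arrays.asList("toyota", "camry"));

        System.out.println(spellingCorrect(
                Arrays.asList("toyota", "camry"), dictionary::contains)); // true
        System.out.println(spellingCorrect(
                Arrays.asList("toyata", "camry"), dictionary::contains)); // false
    }
}
```

The client would then show a suggestion only when the flag is false.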

There is a method in Lucene's SC to check whether a word exists in the index, 
so it's easy to check whether a word is correct. However, I'm also thinking of 
the situation where the query contains more than simple words, for instance: 
"toyata AND miles:[1 to 10000]". Here we want to check just "toyata" against 
the index and return the suggestion "toyota AND miles:[1 to 10000]". Other 
query types which might pose a problem are:
- fuzzy query
- wildcard query
- prefix query
...
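
To make the problem concrete, here is a deliberately naive sketch in plain 
Java. It is not a query parser: it keeps a bracketed range clause together as 
one token, splits everything else on whitespace, and only spell-checks tokens 
made purely of letters and digits that are not AND/OR/NOT. Fuzzy, wildcard, 
prefix, fielded and range clauses are passed through untouched. The class, the 
method names and the corrections map are hypothetical; a real implementation 
would more likely work from the parsed query or the analyzer's tokens.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Solr or of the SOLR-572 patch.
class QueryRewriteSketch {

    // A range clause such as miles:[1 to 10000] stays one token;
    // anything else is a run of non-whitespace characters.
    private static final Pattern TOKEN =
            Pattern.compile("\\S*[\\[{][^\\]}]*[\\]}]|\\S+");

    // Only letters and digits: excludes ~ * ? : [ ] and similar syntax.
    private static final Pattern PLAIN_WORD = Pattern.compile("[\\p{L}\\p{N}]+");

    static boolean isCheckable(String token) {
        if (token.equals("AND") || token.equals("OR") || token.equals("NOT")) {
            return false;
        }
        return PLAIN_WORD.matcher(token).matches();
    }

    // Replaces checkable tokens that have an entry in 'corrections' and
    // copies everything else (including whitespace) through unchanged.
    static String rewrite(String query, Map<String, String> corrections) {
        Matcher m = TOKEN.matcher(query);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String token = m.group();
            String replacement = isCheckable(token)
                    ? corrections.getOrDefault(token, token)
                    : token;
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        // Assumption: the spell checker already produced this correction.
        Map<String, String> corrections = new HashMap<>();
        corrections.put("toyata", "toyota");

        // prints: toyota AND miles:[1 to 10000]
        System.out.println(rewrite("toyata AND miles:[1 to 10000]", corrections));
        // prints: toyata~ toy* to?ota (fuzzy, prefix and wildcard are skipped)
        System.out.println(rewrite("toyata~ toy* to?ota", corrections));
    }
}
```

Skipping fuzzy, wildcard and prefix clauses is only one possible policy; 
whether those should be corrected at all is still an open question here.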

> Spell Checker as a Search Component
> -----------------------------------
>
>                 Key: SOLR-572
>                 URL: https://issues.apache.org/jira/browse/SOLR-572
>             Project: Solr
>          Issue Type: New Feature
>          Components: spellchecker
>    Affects Versions: 1.3
>            Reporter: Shalin Shekhar Mangar
>             Fix For: 1.3
>
>         Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, 
> SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component. Provide the 
> following features:
> * Allow creating a spell index on a given field and make it possible to have 
> multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and 
> process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as they are
> * Never give duplicate words in a suggestion
