[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12600294#action_12600294 ]
Otis Gospodnetic commented on SOLR-572:
---------------------------------------

Right, Google only shows you the final output, not what they do in the backend. But the fact that they italicize misspelled words tells us they have a mechanism that allows the front end to identify them.

So I think our task here is to figure out the best/easiest way for the client to identify misspelled words and offer the alternative query to the end user. I think what I outlined above will do that for us:

* output all words sequentially
* mark the words that are misspelled - it may be best to return the original word plus corrected word:

    <word="london"/> <!-- unchanged -->
    <word="brigge">bridge</word>

or maybe with offset info:

    <word="london" offset="0"/> <!-- unchanged -->
    <word="brigge" offset="6">bridge</word>

It's also fine to (*also*) return the final corrected string that doesn't mark the corrected words in any way, and let the "lazy" clients just use that.

Grant or Shalin, will either of you be adding this?

> Spell Checker as a Search Component
> -----------------------------------
>
> Key: SOLR-572
> URL: https://issues.apache.org/jira/browse/SOLR-572
> Project: Solr
> Issue Type: New Feature
> Components: spellchecker
> Affects Versions: 1.3
> Reporter: Shalin Shekhar Mangar
> Assignee: Grant Ingersoll
> Priority: Minor
> Fix For: 1.3
>
> Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component.
> Provide the following features:
> * Allow creating a spell index on a given field and make it possible to have multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as they are
> * Never give duplicate words in a suggestion
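
For illustration only, here is a minimal Java sketch of the per-word output proposed in the comment: every word is emitted in order, misspelled words carry their correction, and a plain corrected string is built for "lazy" clients. Nothing here is taken from the attached patches. The class `SpellMarkup`, its methods, the `orig` attribute name, and the corrections map standing in for a real spell index are all assumptions. Offsets are counted as zero-based character positions in the original query, which is only one possible reading of the `offset` attribute.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Solr or of the SOLR-572 patches.
final class SpellMarkup {

    /** One query word: start offset, original text, and a correction (null if unchanged). */
    static final class Word {
        final int offset;
        final String original;
        final String correction; // null means the word is preserved as-is

        Word(int offset, String original, String correction) {
            this.offset = offset;
            this.original = original;
            this.correction = correction;
        }
    }

    /** Splits the query on whitespace and looks each word up in the stand-in corrections map. */
    static List<Word> mark(String query, Map<String, String> corrections) {
        List<Word> words = new ArrayList<>();
        int i = 0;
        while (i < query.length()) {
            if (Character.isWhitespace(query.charAt(i))) {
                i++;
                continue;
            }
            int start = i;
            while (i < query.length() && !Character.isWhitespace(query.charAt(i))) {
                i++;
            }
            String original = query.substring(start, i);
            words.add(new Word(start, original, corrections.get(original)));
        }
        return words;
    }

    /** Renders one element per word; "orig" is an assumed attribute name. No XML escaping is done. */
    static String toXml(List<Word> words) {
        StringBuilder sb = new StringBuilder();
        for (Word w : words) {
            sb.append("<word orig=\"").append(w.original)
              .append("\" offset=\"").append(w.offset).append('"');
            if (w.correction == null) {
                sb.append("/>");
            } else {
                sb.append('>').append(w.correction).append("</word>");
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    /** Builds the unmarked corrected string, joining words with single spaces. */
    static String collate(List<Word> words) {
        List<String> out = new ArrayList<>();
        for (Word w : words) {
            out.add(w.correction != null ? w.correction : w.original);
        }
        return String.join(" ", out);
    }

    public static void main(String[] args) {
        List<Word> words = mark("london brigge", Map.of("brigge", "bridge"));
        System.out.print(toXml(words));
        System.out.println(collate(words));
    }
}
```

Returning both forms from the same word list keeps them consistent with each other: words without a correction pass through untouched, which matches the "preserve the correct words" criterion in the issue description. The sketch does not attempt the "no duplicate words" criterion.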