[jira] Commented: (SOLR-81) Add Query Spellchecker functionality

Hoss Man (JIRA) Wed, 07 Mar 2007 13:02:46 -0800

    [ 
https://issues.apache.org/jira/browse/SOLR-81?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12478893
 ]


Hoss Man commented on SOLR-81:
------------------------------

Looking over both Otis's patches and Adam's patches for the first time, I find 
myself really confused.

As previously discussed in email, there are two completely different approaches 
that could be taken to achieve "spell correction" using Solr:

1) Use something like the Lucene SpellChecker contrib to make suggestions 
based on the data in the main Solr index (defined by the Solr schema) ... adding 
hooks to Solr to keep the SpellChecker system aware of changes to the main 
index, and hooks to allow request handlers to return suggestions with each query.

2) Use the main Solr index (defined by the schema) to store the dictionary of 
words, turning the entire Solr instance into one giant SpellChecker.  In this 
case there would be a recommended schema.xml for users who want to set up a 
SpellChecker Solr instance, and possibly a custom RequestHandler that assumes 
you are using this schema.


These two patches both seem to be dealing with case #1, but they have hints of 
approach #2 ... for example, I don't entirely understand why they include the 
NGram token filter factories, since they don't seem to need the fields of the 
Solr index to be tokenized in any special way (the Lucene SpellChecker 
controls the format of its own dictionary).  It's also not clear to me what the 
purpose of the SpellCheckerRequestHandler is ... if the main index is storing 
"real" user records, then wouldn't a helper method that existing request 
handlers (like dismax and standard) can optionally call to get the SpellChecker 
data be more useful?

> Add Query Spellchecker functionality
> ------------------------------------
>
>                 Key: SOLR-81
>                 URL: https://issues.apache.org/jira/browse/SOLR-81
>             Project: Solr
>          Issue Type: New Feature
>          Components: search
>            Reporter: Otis Gospodnetic
>            Priority: Minor
>         Attachments: SOLR-81-edgengram-ngram.patch, 
> SOLR-81-ngram-schema.patch, SOLR-81-ngram.patch, SOLR-81-ngram.patch, 
> SOLR-81-ngram.patch, SOLR-81-ngram.patch, SOLR-81-spellchecker.patch, 
> SOLR-81-spellchecker.patch
>
>
> Use the simple approach of n-gramming outside of Solr and indexing n-gram 
> documents.  For example:
> <doc>
> <field name="word">lettuce</field>
> <field name="start3">let</field>
> <field name="gram3">let ett ttu tuc uce</field>
> <field name="end3">uce</field>
> <field name="start4">lett</field>
> <field name="gram4">lett ettu ttuc tuce</field>
> <field name="end4">tuce</field>
> </doc>
> See:
> http://www.mail-archive.com/[email protected]/msg01254.html
> Java clients: SOLR-20 (add delete commit optimize), SOLR-30 (search)
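
For illustration only, the n-gram document in the quoted description could be 
generated outside of Solr with something like the sketch below.  The function 
names (ngrams, spell_doc) and the default gram sizes are assumptions made for 
this example; only the field names (word, startN, gramN, endN) come from the 
description, and none of this is taken from the attached patches.

```python
from xml.sax.saxutils import escape


def ngrams(word, n):
    """Return every contiguous n-character slice of word, in order."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def spell_doc(word, sizes=(3, 4)):
    """Build a Solr <doc> element using the field layout from the description.

    Hypothetical helper: the sizes default is an assumption for this sketch.
    """
    fields = [("word", word)]
    for n in sizes:
        grams = ngrams(word, n)
        if not grams:
            # Word is shorter than n, so there is nothing to index for this size.
            continue
        fields.append((f"start{n}", grams[0]))
        fields.append((f"gram{n}", " ".join(grams)))
        fields.append((f"end{n}", grams[-1]))
    lines = ["<doc>"]
    lines += [f'<field name="{name}">{escape(value)}</field>'
              for name, value in fields]
    lines.append("</doc>")
    return "\n".join(lines)


print(spell_doc("lettuce"))
```

Calling spell_doc("lettuce") produces the same fields as the example document 
in the description; the result would still need to be wrapped in an <add> 
message and posted to the update handler.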

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
