[ https://issues.apache.org/jira/browse/SOLR-572?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12605645#action_12605645 ]
Grant Ingersoll commented on SOLR-572:
--------------------------------------

{quote}
Why is a WhiteSpaceTokenizer being used for tokenizing the value for a spellcheck.q parameter? Wouldn't it be more correct to use the query analyzer if the index is being built from a Solr field?

The above argument also applies to queryAnalyzerFieldType which is being used for QueryConverter
{quote}

My understanding was that the sc.q parameter was already analyzed and ready to be checked, thus all it needed was a conversion to tokens. As for the queryAnalyzerFieldType, that assumes the implementation is the IndexBasedSpellChecker or some other field based one that the SpellCheckComponent doesn't have access to, thus my reasoning that it needs to be handled separately and explicitly, which is why it isn't a part of the spellchecker configuration.

{quote}
I see that we can specify our own query converter through the queryConverter section in solrconfig.xml. But the SpellCheckComponent uses SpellingQueryConverter directly instead of an interface. We should add a QueryConvertor interface if this needs to be pluggable.
{quote}

I thought about making it an abstract base class, but in my mind it is really easy to override the SpellingQueryConverter and the component should know how to deal with it.

{quote}
If name is omitted from two dictionaries in solrconfig.xml then both get named as Default from the SolrSpellChecker#init method and they overwrite each other in the spellCheckers map
{quote}

Hmm, not good. I will fix.

{quote}
How about building the index in the inform() method? I understand that the users can build the index using spellcheck.build=true and they can also use QuerySenderListener to build the index but this limits the user to use FSDirectory because if we use RAMDirectory and solr is restarted, the QuerySenderListener never fires and spell checker is left with no index.
It's not a major inconvenience to use FSDirectory always but then RAMDirectory doesn't bring much to the table.
{quote}

I think this gets back to our early discussions about it not working in inform b/c we don't have the reader at that point, or something like that. I really don't know the right answer, but do feel free to try it out. I do think it belongs in inform, but not sure if Solr is ready at that point. As for the QuerySenderListener, seems like it should fire if it is restarted, but I admit I don't know a whole lot about that functionality.

> Spell Checker as a Search Component
> -----------------------------------
>
>                 Key: SOLR-572
>                 URL: https://issues.apache.org/jira/browse/SOLR-572
>             Project: Solr
>          Issue Type: New Feature
>          Components: spellchecker
>    Affects Versions: 1.3
>            Reporter: Shalin Shekhar Mangar
>            Assignee: Grant Ingersoll
>            Priority: Minor
>             Fix For: 1.3
>
>         Attachments: SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch,
> SOLR-572.patch, SOLR-572.patch, SOLR-572.patch, SOLR-572.patch
>
>
> Expose the Lucene contrib SpellChecker as a Search Component.
> Provide the following features:
> * Allow creating a spell index on a given field and make it possible to have multiple spell indices -- one for each field
> * Give suggestions on a per-field basis
> * Given a multi-word query, give only one consistent suggestion
> * Process the query with the same analyzer specified for the source field and process each token separately
> * Allow the user to specify minimum length for a token (optional)
> Consistency criteria for a multi-word query can consist of the following:
> * Preserve the correct words in the original query as it is
> * Never give duplicate words in a suggestion

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
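
For illustration, the name collision raised in the comment (two dictionaries with no name both ending up under the default key of the spellCheckers map, so the second silently replaces the first) can be sketched in plain Java. This is a hypothetical stand-in and not the SolrSpellChecker code: the class DictionaryNameCollision, both register methods, the {name, field} array layout, and the lowercase "default" key are assumptions made for the example. The duplicate check is only one possible guard, not necessarily the fix that went into the patch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only; none of these names come from Solr. */
public class DictionaryNameCollision {

    /** Assumed fallback name for a dictionary declared without a name. */
    static final String DEFAULT_NAME = "default";

    /**
     * Naive registration. Each config is a {name, field} pair and name may be null.
     * Map.put replaces an existing entry, so two unnamed dictionaries collide
     * and only the last one survives.
     */
    static Map<String, String> registerNaive(List<String[]> configs) {
        Map<String, String> checkers = new LinkedHashMap<>();
        for (String[] config : configs) {
            String name = (config[0] == null) ? DEFAULT_NAME : config[0];
            checkers.put(name, config[1]);
        }
        return checkers;
    }

    /** One possible guard: fail fast when a name is already registered. */
    static Map<String, String> registerStrict(List<String[]> configs) {
        Map<String, String> checkers = new LinkedHashMap<>();
        for (String[] config : configs) {
            String name = (config[0] == null) ? DEFAULT_NAME : config[0];
            if (checkers.containsKey(name)) {
                throw new IllegalArgumentException(
                        "Duplicate spell checker name: " + name);
            }
            checkers.put(name, config[1]);
        }
        return checkers;
    }

    public static void main(String[] args) {
        List<String[]> configs = new ArrayList<>();
        configs.add(new String[] {null, "title"}); // first unnamed dictionary
        configs.add(new String[] {null, "body"});  // second unnamed dictionary

        // Only one entry is left: the "title" dictionary has been overwritten.
        System.out.println(registerNaive(configs));

        try {
            registerStrict(configs);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Failing fast at init time surfaces the misconfiguration instead of hiding it. Generating distinct names for unnamed dictionaries would be another option, at the cost of names the user never chose.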