RE: Solr Spellcheck suggestions only return from /select handler when returning search results
Hi James, hi list,

I can confirm that the index contains data within 1 Levenshtein step of ichtscheiben:

{
  "responseHeader": {
    "status": 0,
    "QTime": 0,
    "params": {
      "fl": "name,spell",
      "indent": "true",
      "q": "name:Sichtscheiben",
      "_": "1410423419758",
      "wt": "json",
      "rows": "50"
    }
  },
  "response": {
    "numFound": 6,
    "start": 0,
    "docs": [
      { "name": "Sichtscheiben", "spell": "Sichtscheiben" },
      { "name": "Sichtscheiben", "spell": "Sichtscheiben" },
      { "name": "Sichtscheiben", "spell": "Sichtscheiben" },
      { "name": "Sichtscheiben", "spell": "Sichtscheiben" },
      { "name": "Sichtscheiben", "spell": "Sichtscheiben" },
      { "name": "Sichtscheiben", "spell": "Sichtscheiben" }
    ]
  }
}

Multiple records exist that should match. The note on alternativeTermCount is appreciated.

I've tried another term: Transport. I get suggestions when I use Transpor and Transpo, even Transpotr, but ransport doesn't yield any suggestions. Maybe it's a question of the beginning of the word and doesn't really have anything to do with stemming.
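The edit distances discussed in this message can be checked by hand. The following is a minimal sketch of the classic Levenshtein distance (insert, delete, substitute); it is not Lucene's internal implementation, and the function name is only illustrative.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


if __name__ == "__main__":
    pairs = [
        ("ichtscheiben", "Sichtscheiben"),
        ("Sichtscheibe", "Sichtscheiben"),
        ("Sichtscheib", "Sichtscheiben"),
        ("Transpor", "Transport"),
        ("Transpo", "Transport"),
        ("Transpotr", "Transport"),
        ("ransport", "Transport"),
    ]
    for query, term in pairs:
        print(f"{query} -> {term}: {levenshtein(query, term)}")
```

By this measure ichtscheiben and ransport are each a single edit away from the indexed term, so edit distance alone does not explain the missing suggestions.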
RE: Solr Spellcheck suggestions only return from /select handler when returning search results
Thomas,

Yes, you are right: the problem is that the beginning of the word needs correction. If you are using DirectSolrSpellChecker, you need to set the minPrefix parameter to 0. Otherwise the default (1) requires the first character to match before it will try to correct the word. See http://lucene.apache.org/core/4_10_0/suggest/org/apache/lucene/search/spell/DirectSpellChecker.html#setMinPrefix%28int%29

James Dyer
Ingram Content Group
(615) 213-4311
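The effect of minPrefix described in this message can be illustrated with a toy model. This is only a sketch under stated assumptions: the function names (levenshtein, candidates) are hypothetical, the filter is a simplification of the behavior described here, and it is not how Lucene implements DirectSpellChecker internally.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain dynamic-programming edit distance (insert, delete, substitute)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def candidates(query, terms, min_prefix=1, max_edits=2):
    """Toy filter: keep terms that share the first `min_prefix` characters
    with the query and lie within `max_edits` edits of it."""
    prefix = query[:min_prefix]
    return [t for t in terms
            if t.startswith(prefix) and levenshtein(query, t) <= max_edits]


if __name__ == "__main__":
    indexed = ["Transport", "Sichtscheiben"]
    # With a required prefix of 1, a missing first letter blocks the match.
    print(candidates("ransport", indexed, min_prefix=1))
    # With no required prefix, the one-edit neighbour is found.
    print(candidates("ransport", indexed, min_prefix=0))
    # Errors later in the word pass the prefix check either way.
    print(candidates("Transpotr", indexed, min_prefix=1))
```

In this model a query whose first character is wrong or missing never reaches the distance check unless min_prefix is 0, which matches the pattern reported for ransport and ichtscheiben.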
RE: Solr Spellcheck suggestions only return from /select handler when returning search results
Thomas,

It looks like you've set things up correctly: while the user searches against a stemmed field (name), spellcheck checks against a lightly analyzed copy of it (spell). This is the right way to do it, as spellchecking against stemmed forms is usually undesirable. But as you've experienced, you will sometimes get results (due to stemming) and also suggestions (because the spellchecker is looking at unstemmed forms). If you do not want spellcheck to return anything when you get results, you can set spellcheck.maxResultsForSuggest=0.

Now, keeping in mind that we're comparing unstemmed forms, can you verify that you indeed have something in your index that is within 2 edits of ichtscheiben? My guess is you probably don't, which would be why you do not get spelling results in that case. Also, even if you do have something within 2 edits, if ichtscheiben occurs in your index, by default it won't try to correct it at all (even if the query returns nothing, maybe because of filters or other required terms on the query). In this case you need to set spellcheck.alternativeTermCount to a non-zero value (try maybe 5). See http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.alternativeTermCount and following sections.

James Dyer
Ingram Content Group
(615) 213-4311

-----Original Message-----
From: Thomas Michael Engelke [mailto:thomas.enge...@posteo.de]
Sent: Wednesday, September 10, 2014 5:00 AM
To: Solr user
Subject: Solr Spellcheck suggestions only return from /select handler when returning search results

Hi,

I'm experimenting with the Spellcheck component and have therefore used the example configuration for spell checking to try things out. My solrconfig.xml looks like this:

    <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
      <str name="queryAnalyzerFieldType">spell</str>
      <!-- Multiple Spell Checkers can be declared and used by this component -->
      <!-- a spellchecker built from a field of the main index -->
      <lst name="spellchecker">
        <str name="name">default</str>
        <str name="field">spell</str>
        <str name="classname">solr.DirectSolrSpellChecker</str>
        <!-- the spellcheck distance measure used, the default is the internal levenshtein -->
        <str name="distanceMeasure">internal</str>
        <!-- uncomment this to require suggestions to occur in 1% of the documents
        <float name="thresholdTokenFrequency">.01</float>
        -->
      </lst>
      <!-- a spellchecker that can break or combine words. See /spell handler below for usage -->
      <lst name="spellchecker">
        <str name="name">wordbreak</str>
        <str name="classname">solr.WordBreakSolrSpellChecker</str>
        <str name="field">spell</str>
        <str name="combineWords">true</str>
        <str name="breakWords">true</str>
        <int name="maxChanges">10</int>
      </lst>
    </searchComponent>

And I've added the spellcheck component to my /select request handler:

    <requestHandler name="/select" class="solr.SearchHandler">
      ...
      <arr name="last-components">
        <str>spellcheck</str>
      </arr>
    </requestHandler>

I have built up the spellchecker source in the schema.xml from the name field:

    <field name="spell" type="spell" indexed="true" stored="true" required="false" multiValued="false"/>
    <copyField source="name" dest="spell" maxChars="3" />
    ...
    <fieldType name="spell" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
      </analyzer>
    </fieldType>

As I'm querying the /select request handler, I should get spellcheck suggestions with my results. However, I rarely get a suggestion. Examples:

    query: Sichtscheibe, spellcheck suggestion: Sichtscheiben (works)
    query: Sichtscheib, spellcheck suggestion: Sichtscheiben (works)
    query: ichtscheiben, no spellcheck suggestions

As far as I can tell, I only get suggestions when I get real search results. I get results for the first 2 examples because the German StemFilterFactory reduces Sichtscheibe and Sichtscheiben to Sichtscheib, so matches are found. However, the third query should result in a suggestion, as its Levenshtein distance is smaller than in the second example.

Suggestions, improvements, corrections?
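The request parameters suggested in the reply above (spellcheck.alternativeTermCount and spellcheck.maxResultsForSuggest) can also be passed per query rather than set in solrconfig.xml. The sketch below only builds the request URL; the base URL and core name (http://localhost:8983/solr/collection1) and the helper name spellcheck_url are assumptions for illustration, not values from this thread.

```python
from urllib.parse import urlencode


def spellcheck_url(base_url, query, alternative_term_count=5,
                   max_results_for_suggest=None):
    """Build a /select URL that turns spellcheck on for one request.

    base_url is a hypothetical Solr core URL; adjust it to your setup.
    """
    params = {
        "q": query,
        "wt": "json",
        "spellcheck": "true",
        # Non-zero so that terms already present in the index may still be corrected.
        "spellcheck.alternativeTermCount": alternative_term_count,
    }
    if max_results_for_suggest is not None:
        # 0 suppresses suggestions whenever the query returns any results.
        params["spellcheck.maxResultsForSuggest"] = max_results_for_suggest
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    print(spellcheck_url("http://localhost:8983/solr/collection1",
                         "name:ichtscheiben",
                         max_results_for_suggest=0))
```

Passing the parameters on the URL makes it easy to compare responses with and without each setting before committing a default to the request handler.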