RE: Odd Edge Case for SpellCheck
This is a great help, thank you!

Brett Moyer

-Original Message-
From: Erick Erickson
Sent: Monday, November 25, 2019 4:12 PM
To: solr-user@lucene.apache.org
Subject: Re: Odd Edge Case for SpellCheck

If you’re using direct spell checking, it looks for the _indexed_ term. This means you get stemmed corrections if you’re stemming, etc. Usually you should use a copyField to a field with minimal analysis and use that field for spellchecking.

Another way to think about it: if you use the admin/analysis page for terms in a field, the terms in the dictionary are what’s at the end of the indexed side of the page.

Best,
Erick

*** This e-mail may contain confidential or privileged information. If you are not the intended recipient, please notify the sender immediately and then delete it.

TIAA ***
Re: Odd Edge Case for SpellCheck
If you’re using direct spell checking, it looks for the _indexed_ term. This means you get stemmed corrections if you’re stemming, etc. Usually you should use a copyField to a field with minimal analysis and use that field for spellchecking.

Another way to think about it: if you use the admin/analysis page for terms in a field, the terms in the dictionary are what’s at the end of the indexed side of the page.

Best,
Erick

> On Nov 25, 2019, at 4:02 PM, Moyer, Brett wrote:
>
> Yes we are stemming, ahh so we shouldn't stem our words to be spelled?
>
> Brett Moyer
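
The copyField approach described above might look roughly like the following. This is only a sketch under stated assumptions: the source field `content`, the destination field `spell`, and the field type `text_spell` are hypothetical names chosen for illustration and do not come from the thread. The idea is that the spellcheck dictionary is built from a field whose analysis chain only tokenizes and lowercases, so the indexed terms are whole words rather than stems.

```xml
<!-- Schema fragment (managed-schema / schema.xml). All names are hypothetical. -->

<!-- Minimal analysis: tokenize and lowercase only, no stemming filter. -->
<fieldType name="text_spell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Field used only as the spellcheck dictionary source. -->
<field name="spell" type="text_spell" indexed="true" stored="false" multiValued="true"/>

<!-- Copy the raw text of the (stemmed) search field into the unstemmed field. -->
<copyField source="content" dest="spell"/>

<!-- solrconfig.xml fragment: point the spellchecker at the unstemmed field. -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- Analyze the incoming query terms the same way as the dictionary field. -->
  <str name="queryAnalyzerFieldType">text_spell</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">spell</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
  </lst>
</searchComponent>
```

After a schema change like this the documents need to be reindexed so that the new field is populated. The admin/analysis page can then be used to confirm that the indexed side of `spell` shows unstemmed terms.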
RE: Odd Edge Case for SpellCheck
Yes we are stemming, ahh so we shouldn't stem our words to be spelled?

Brett Moyer

-Original Message-
From: Jörn Franke
Sent: Friday, November 22, 2019 8:34 AM
To: solr-user@lucene.apache.org
Subject: Re: Odd Edge Case for SpellCheck

Stemming involved ?
Re: Odd Edge Case for SpellCheck
Stemming involved?

> On 22.11.2019 at 14:23, Moyer, Brett wrote:
>
> Hello, we have spellcheck running, using the index as the dictionary. An odd
> use case came up today; wanted to get your thoughts and see if what we
> determined is correct. Use case: User sends a query for q=brokerage,
> spellcheck fires and returns "brokerage". Looking at the output I see that
> solr must have pulled the root word "brokage" then spellcheck said hey I need
> to fix that. Is that correct? There's no issue, it's just an unexpected
> outcome. Thanks!
>
> "q":"brokerage",
> "spellcheck":{
>   "suggestions":
>   [
>     {"name":"brokage",{
>       "type":"str","value":"numFound":1,
>       "startOffset":0,
>       "endOffset":9,
>       "suggestion":["brokerage"]}}],
>   "collations":
>   [
>     {"name":"collation","type":"str","value":"brokerage"}]}}
>
> Brett Moyer