RE: Odd Edge Case for SpellCheck

2019-11-25 Thread Moyer, Brett
This is a great help, thank you!

Brett Moyer

-----Original Message-----
From: Erick Erickson  
Sent: Monday, November 25, 2019 4:12 PM
To: solr-user@lucene.apache.org
Subject: Re: Odd Edge Case for SpellCheck

If you’re using direct spell checking, it looks for the _indexed_ term. So this 
means you get stemmed corrections if you’re stemming etc. Usually you should 
use a copyField to a field with minimal analysis and use that field for 
spellchecking.

Another way to think about it: if you run terms through the admin/analysis page 
for a field, the terms in the dictionary are what’s at the end of the index 
side of the page.

Best,
Erick

> On Nov 25, 2019, at 4:02 PM, Moyer, Brett  wrote:
> 
> Yes, we are stemming. Ahh, so we shouldn't stem the words used for spellchecking?
> 
> Brett Moyer
> 
> -----Original Message-----
> From: Jörn Franke 
> Sent: Friday, November 22, 2019 8:34 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Odd Edge Case for SpellCheck
> 
> Is stemming involved?
> 
>> On 22 Nov 2019, at 14:23, Moyer, Brett wrote:
>> 
>> Hello, we have spellcheck running, using the index as the dictionary. An 
>> odd use case came up today, and I wanted to get your thoughts and see if 
>> what we determined is correct. Use case: a user sends a query for 
>> q=brokerage, spellcheck fires and returns "brokerage". Looking at the 
>> output, I see that Solr must have pulled the root word "brokage", and then 
>> spellcheck decided it needed to fix that. Is that correct? There's no 
>> issue; it's just an unexpected outcome. Thanks!
>> 
>> "q":"brokerage",
>> "spellcheck":{
>>   "suggestions":
>>   [
>>     {"name":"brokage",{
>>       "type":"str","value":"numFound":1,
>>       "startOffset":0,
>>       "endOffset":9,
>>       "suggestion":["brokerage"]}}],
>>   "collations":
>>   [
>>     {"name":"collation","type":"str","value":"brokerage"}]}}
>> 
>> Brett Moyer
>> *** This e-mail may contain confidential or privileged information.
>> If you are not the intended recipient, please notify the sender immediately 
>> and then delete it.
>> 
>> TIAA


Re: Odd Edge Case for SpellCheck

2019-11-25 Thread Erick Erickson
If you’re using direct spell checking, it looks for the _indexed_ term. So this 
means you get stemmed corrections if you’re stemming etc. Usually you should 
use a copyField to a field with minimal analysis and use that field for 
spellchecking.

Another way to think about it: if you run terms through the admin/analysis page 
for a field, the terms in the dictionary are what’s at the end of the index 
side of the page.
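
A minimal sketch of the copyField approach described above. The field and type names (content, content_spell, text_spell) are hypothetical and are not taken from the poster's schema, and the spellchecker is assumed to be solr.DirectSolrSpellChecker; adjust both to match your own configuration.

```xml
<!-- Schema fragment (managed-schema or schema.xml).
     All field and type names here are hypothetical examples. -->

<!-- Minimal analysis: tokenize and lowercase only, no stemming -->
<fieldType name="text_spell" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Dictionary source for spellcheck; it only needs to be indexed -->
<field name="content_spell" type="text_spell" indexed="true" stored="false" multiValued="true"/>

<!-- Copy the raw text of the stemmed search field into the spellcheck field -->
<copyField source="content" dest="content_spell"/>

<!-- solrconfig.xml fragment: point the spellchecker at the unstemmed field -->
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- Analyze the incoming query the same way as the dictionary field -->
  <str name="queryAnalyzerFieldType">text_spell</str>
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">content_spell</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
  </lst>
</searchComponent>
```

Since copyField is applied at index time, existing documents need to be reindexed before the new field contains any terms. With a setup along these lines, a query term such as "brokerage" should be compared against unstemmed dictionary terms rather than stems like "brokage".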

Best,
Erick




RE: Odd Edge Case for SpellCheck

2019-11-25 Thread Moyer, Brett
Yes, we are stemming. Ahh, so we shouldn't stem the words used for spellchecking?

Brett Moyer



Re: Odd Edge Case for SpellCheck

2019-11-22 Thread Jörn Franke
Is stemming involved?
