[ 
https://issues.apache.org/jira/browse/SOLR-10263?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16081922#comment-16081922
 ] 

Abhishek Kumar Singh edited comment on SOLR-10263 at 7/11/17 1:55 PM:
----------------------------------------------------------------------

The problem with {{maxCollationTries}} is that a {{collationTry}} is an 
expensive step, so there is a limit to how far we can raise its value while 
still meeting a given response-time requirement. 

Control over {{wordBreak}} suggestions gives us more freedom to get relevant 
suggestions in cases where we know what our queries will look like. 
For example, *gold mine sunglasses* will also produce suggestions like *gold 
mine sun glasses* or even *gold mine sung lasses*, which then use up our 
limited {{maxCollationTries}}. In this case, {{SUGGEST_WHEN_NOT_IN_INDEX}} for 
{{wordBreak}} avoids these suggestions, which we already know are not 
required. 
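
To make the effect concrete, here is a minimal sketch of the gating described 
above. It is not Solr's actual {{WordBreakSolrSpellChecker}} code: the class 
{{WordBreakGateSketch}}, the method {{wordBreaks}}, and the in-memory set of 
indexed terms are all hypothetical, and only two of the suggest modes are 
modeled. 

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical sketch; not Solr's actual word-break implementation.
final class WordBreakGateSketch {

    // Mirrors the names of the two suggest modes discussed in this comment.
    enum SuggestMode { SUGGEST_ALWAYS, SUGGEST_WHEN_NOT_IN_INDEX }

    /** Returns every two-way split of token whose halves are both indexed terms. */
    static List<String> wordBreaks(String token, Set<String> indexedTerms, SuggestMode mode) {
        List<String> out = new ArrayList<>();
        // Under SUGGEST_WHEN_NOT_IN_INDEX, a token that is already indexed is left alone,
        // so it contributes no extra candidates to the collation step.
        if (mode == SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX && indexedTerms.contains(token)) {
            return out;
        }
        for (int i = 1; i < token.length(); i++) {
            String left = token.substring(0, i);
            String right = token.substring(i);
            if (indexedTerms.contains(left) && indexedTerms.contains(right)) {
                out.add(left + " " + right);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Assumed index contents, chosen to match the example query.
        Set<String> index = Set.of("gold", "mine", "sunglasses", "sun", "glasses", "sung", "lasses");
        // Prints [sun glasses, sung lasses]
        System.out.println(wordBreaks("sunglasses", index, SuggestMode.SUGGEST_ALWAYS));
        // Prints []
        System.out.println(wordBreaks("sunglasses", index, SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX));
    }
}
```

With {{SUGGEST_ALWAYS}}, both unwanted breaks would go on to compete for 
collation tries; with {{SUGGEST_WHEN_NOT_IN_INDEX}}, neither is generated. 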

This is why we faced cases in which different {{SpellCheckComponents}} 
required different *suggestModes*. 
I also think _wordBreak_ and _wordJoin_ (within {{WordBreakSolrSpellChecker}}) 
should have separate _suggestMode_ configurations, because the use cases 
can vary widely. (For the use case above, we want *gold* and *mine* to 
be combined into *goldmine*, so {{wordJoin}} still needs {{SUGGEST_ALWAYS}}.) 

So in our case, after applying the above patch, we configured 
{{DirectSolrSpellChecker}} as {{SUGGEST_ALWAYS}} and only {{wordBreak}} 
as {{SUGGEST_WHEN_NOT_IN_INDEX}}, so that {{wordJoin}} still 
had {{SUGGEST_ALWAYS}}. 
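
The resulting setup can be pictured with the following sketch. The names 
{{PerCheckerSuggestModeSketch}}, {{Source}}, {{configuredModes}} and 
{{maySuggest}} are hypothetical and are not taken from the patch; the sketch 
only illustrates one suggest mode per suggestion source. 

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical sketch of per-source suggest modes; not the patch's actual API.
final class PerCheckerSuggestModeSketch {

    enum SuggestMode { SUGGEST_ALWAYS, SUGGEST_WHEN_NOT_IN_INDEX }

    // The three suggestion sources mentioned in this comment.
    enum Source { DIRECT, WORD_BREAK, WORD_JOIN }

    /** The configuration described above: only word breaks are restricted. */
    static Map<Source, SuggestMode> configuredModes() {
        Map<Source, SuggestMode> modes = new EnumMap<>(Source.class);
        modes.put(Source.DIRECT, SuggestMode.SUGGEST_ALWAYS);
        modes.put(Source.WORD_BREAK, SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX);
        modes.put(Source.WORD_JOIN, SuggestMode.SUGGEST_ALWAYS);
        return modes;
    }

    /** Decides whether a source may suggest for a token, given whether the token is indexed. */
    static boolean maySuggest(Map<Source, SuggestMode> modes, Source source, boolean tokenInIndex) {
        // Assumption: an unconfigured source falls back to the restrictive mode.
        SuggestMode mode = modes.getOrDefault(source, SuggestMode.SUGGEST_WHEN_NOT_IN_INDEX);
        return mode == SuggestMode.SUGGEST_ALWAYS || !tokenInIndex;
    }

    public static void main(String[] args) {
        Map<Source, SuggestMode> modes = configuredModes();
        // "sunglasses" is indexed, so no word break is attempted: prints false
        System.out.println(maySuggest(modes, Source.WORD_BREAK, true));
        // "gold" and "mine" are indexed, but joining is still allowed: prints true
        System.out.println(maySuggest(modes, Source.WORD_JOIN, true));
        // Direct suggestions are always allowed: prints true
        System.out.println(maySuggest(modes, Source.DIRECT, true));
    }
}
```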


> Different SpellcheckComponents should have their own suggestMode
> ----------------------------------------------------------------
>
>                 Key: SOLR-10263
>                 URL: https://issues.apache.org/jira/browse/SOLR-10263
>             Project: Solr
>          Issue Type: Wish
>      Security Level: Public(Default Security Level. Issues are Public) 
>          Components: spellchecker
>            Reporter: Abhishek Kumar Singh
>            Priority: Minor
>         Attachments: SOLR-10263.v2.patch
>
>
> As of now, common spellcheck options are applied to all the 
> SpellCheckComponents.
> This can create a problem in the following case:
> We may want *DirectSolrSpellChecker* to always return spellcheck 
> suggestions ({{SUGGEST_ALWAYS}}). 
> But we may want *WordBreakSolrSpellChecker* to suggest only if the token is 
> not in the index (for relevance or performance reasons) 
> ({{SUGGEST_WHEN_NOT_IN_INDEX}}). 
> *UPDATE:* Recently, we also figured out that, within 
> {{WordBreakSolrSpellChecker}}, {{WordBreak}} and {{WordJoin}} 
> should have different suggestModes as well.
> We faced this problem in our case, where most of the WordJoin cases are 
> those in which the words are individually valid tokens, but what the users 
> are looking for is actually a combination (wordjoin) of the two tokens. 
> For example:
> *gold mine sunglasses*: Here, both *gold* and *mine* are valid tokens. But 
> the actual product being looked for is *goldmine sunglasses*, where 
> *goldmine* is a brand.
> In such cases, we should recommend {{didYouMean:goldmine sunglasses}}. But 
> this won't be possible because we had set {{SUGGEST_WHEN_NOT_IN_INDEX}} for 
> {{WordBreakSolrSpellChecker}} (of which WordJoin is a part). 
> For this, we should have separate suggestModes for `wordJoin` and 
> `wordBreak`. 
> The related changes are in the latest PR: 
> https://github.com/apache/lucene-solr/pull/218


