[ 
https://issues.apache.org/jira/browse/SOLR-8685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15961036#comment-15961036
 ] 

James Dyer commented on SOLR-8685:
----------------------------------

The older IndexBasedSpellChecker can generate less-than-optimal suggestions 
when "spellcheck.count" is very low.  Please ensure you are using the newer 
DirectSolrSpellChecker by adding it to your solrconfig.xml.

{code:xml}
<searchComponent name="spellcheck" class="org.apache.solr.handler.component.SpellCheckComponent">
  ... etc ...   
    <lst name="spellchecker">
      ... etc ...
      <str name="classname">solr.DirectSolrSpellChecker</str>    
      ... etc ...
    </lst>
</searchComponent>
{code}


> Different result depending on count
> -----------------------------------
>
>                 Key: SOLR-8685
>                 URL: https://issues.apache.org/jira/browse/SOLR-8685
>             Project: Solr
>          Issue Type: Bug
>          Components: spellchecker
>            Reporter: Nobuo Onodera
>
> I get different results when {{spellcheck.count}} is less than 5. We expect to 
> get "iaad" as the top result, but actually get "iqad" when 
> {{spellcheck.count=1}}.
> spellcheck.count=5
> {code:xml}
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">20</int>
> </lst>
> <result name="response" numFound="0" start="0" maxScore="0.0"/>
> <lst name="spellcheck">
> <lst name="suggestions">
> <lst name="icat">
> <int name="numFound">5</int>
> <int name="startOffset">3</int>
> <int name="endOffset">7</int>
> <int name="origFreq">0</int>
> <arr name="suggestion">
> <lst>
> <str name="word">iaad</str>
> <int name="freq">1</int>
> </lst>
> <lst>
> <str name="word">ipad</str>
> <int name="freq">1</int>
> </lst>
> <lst>
> <str name="word">iqad</str>
> <int name="freq">1</int>
> </lst>
> <lst>
> <str name="word">irad</str>
> <int name="freq">1</int>
> </lst>
> <lst>
> <str name="word">isad</str>
> <int name="freq">1</int>
> </lst>
> </arr>
> </lst>
> <bool name="correctlySpelled">false</bool>
> <lst name="collation">
> <str name="collationQuery">to:iaad</str>
> <int name="hits">1</int>
> <lst name="misspellingsAndCorrections">
> <str name="icat">iaad</str>
> </lst>
> </lst>
> <lst name="collation">
> <str name="collationQuery">to:ipad</str>
> <int name="hits">1</int>
> <lst name="misspellingsAndCorrections">
> <str name="icat">ipad</str>
> </lst>
> </lst>
> <lst name="collation">
> <str name="collationQuery">to:iqad</str>
> <int name="hits">1</int>
> <lst name="misspellingsAndCorrections">
> <str name="icat">iqad</str>
> </lst>
> </lst>
> <lst name="collation">
> <str name="collationQuery">to:irad</str>
> <int name="hits">1</int>
> <lst name="misspellingsAndCorrections">
> <str name="icat">irad</str>
> </lst>
> </lst>
> <lst name="collation">
> <str name="collationQuery">to:isad</str>
> <int name="hits">1</int>
> <lst name="misspellingsAndCorrections">
> <str name="icat">isad</str>
> </lst>
> </lst>
> </lst>
> </lst>
> </response>
> {code}
> spellcheck.count=1
> {code:xml}
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">16</int>
> </lst>
> <result name="response" numFound="0" start="0" maxScore="0.0"/>
> <lst name="spellcheck">
> <lst name="suggestions">
> <lst name="icat">
> <int name="numFound">1</int>
> <int name="startOffset">3</int>
> <int name="endOffset">7</int>
> <int name="origFreq">0</int>
> <arr name="suggestion">
> <lst>
> <str name="word">iqad</str>
> <int name="freq">1</int>
> </lst>
> </arr>
> </lst>
> <bool name="correctlySpelled">false</bool>
> <lst name="collation">
> <str name="collationQuery">to:iaad</str>
> <int name="hits">1</int>
> <lst name="misspellingsAndCorrections">
> <str name="icat">iaad</str>
> </lst>
> </lst>
> <lst name="collation">
> <str name="collationQuery">to:ipad</str>
> <int name="hits">1</int>
> <lst name="misspellingsAndCorrections">
> <str name="icat">ipad</str>
> </lst>
> </lst>
> <lst name="collation">
> <str name="collationQuery">to:iqad</str>
> <int name="hits">1</int>
> <lst name="misspellingsAndCorrections">
> <str name="icat">iqad</str>
> </lst>
> </lst>
> <lst name="collation">
> <str name="collationQuery">to:irad</str>
> <int name="hits">1</int>
> <lst name="misspellingsAndCorrections">
> <str name="icat">irad</str>
> </lst>
> </lst>
> <lst name="collation">
> <str name="collationQuery">to:isad</str>
> <int name="hits">1</int>
> <lst name="misspellingsAndCorrections">
> <str name="icat">isad</str>
> </lst>
> </lst>
> </lst>
> </lst>
> </response>
> {code}
> The cause is that the {{modifyRequest}} method in {{SpellCheckComponent.java}} 
> forces {{spellcheck.count}} to 5 whenever it is less than 5. The 
> {{mergeSuggestions}} method in {{SolrSpellChecker.java}} then discards some 
> results in the following code.
> {code:java}
>       // skip the first sugQueue.size() - count elements
>       for (int k=0; k < sugQueue.size() - count; k++) sugQueue.pop();
> {code}
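
One way to read the quoted loop: it re-evaluates {{sugQueue.size()}} on every iteration, so the bound shrinks while {{k}} grows and the loop stops early. The sketch below is not Solr code. It uses {{java.util.PriorityQueue}} as a stand-in for the suggestion queue, and the class name, method names, and the reverse-alphabetical ranking are all assumptions made for illustration. With 5 queued entries and count=1, the loop as written removes only 2 entries instead of 4.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.PriorityQueue;

class SkipLoopDemo {
    // Stand-in for the suggestion queue: poll() returns the lowest-ranked entry first.
    // Ranking by reverse alphabetical order is an assumption made only for this demo.
    static PriorityQueue<String> sampleQueue() {
        PriorityQueue<String> q = new PriorityQueue<>(Comparator.reverseOrder());
        q.addAll(Arrays.asList("iaad", "ipad", "iqad", "irad", "isad"));
        return q;
    }

    // Same loop shape as the quoted snippet: size() is re-read on every iteration.
    static int skipAsWritten(PriorityQueue<String> q, int count) {
        int popped = 0;
        for (int k = 0; k < q.size() - count; k++) {
            q.poll();
            popped++;
        }
        return popped;
    }

    // Variant that computes the number of entries to skip once, before popping.
    static int skipWithFixedBound(PriorityQueue<String> q, int count) {
        int toSkip = Math.max(0, q.size() - count);
        for (int k = 0; k < toSkip; k++) {
            q.poll();
        }
        return toSkip;
    }

    public static void main(String[] args) {
        PriorityQueue<String> a = sampleQueue();
        int poppedA = skipAsWritten(a, 1);
        System.out.println("as written: popped " + poppedA + ", next = " + a.peek());

        PriorityQueue<String> b = sampleQueue();
        int poppedB = skipWithFixedBound(b, 1);
        System.out.println("fixed bound: popped " + poppedB + ", next = " + b.peek());
    }
}
```

Under these assumptions the loop as written leaves "iqad" at the head of the queue, which matches the count=1 response in the report, while the fixed-bound variant leaves "iaad".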



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)
