[ https://issues.apache.org/jira/browse/SOLR-4277?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13758364#comment-13758364 ]
scott hobson commented on SOLR-4277:
------------------------------------
Hi,
I am having this same issue. The "correctlySpelled" flag is always false. I
understand that it should still be giving suggestions for the "did you mean..."
searches, but shouldn't the correctlySpelled flag at least be accurate? It
could easily say true and still give you suggested words, and that would be
even better because you can differentiate between a suggestion and a
correction. Right now you cannot, unless I'm missing something...
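One possible client-side workaround, sketched below under some assumptions: the response includes "origFreq" for each term (as it does in the output quoted in the issue description), so a term with origFreq > 0 exists in the index and its alternatives could be treated as "did you mean" suggestions, while origFreq = 0 would indicate a real correction. The function name classify_terms and the two labels are hypothetical, not part of Solr, and this does not fix the flag itself.

```python
import xml.etree.ElementTree as ET

# Sample taken from the "clock" response quoted in the issue description.
SAMPLE = """<lst name="suggestions">
  <lst name="clock">
    <int name="numFound">1</int>
    <int name="startOffset">0</int>
    <int name="endOffset">5</int>
    <int name="origFreq">1</int>
    <arr name="suggestion">
      <lst>
        <str name="word">check</str>
        <int name="freq">1</int>
      </lst>
    </arr>
  </lst>
  <bool name="correctlySpelled">false</bool>
</lst>"""


def classify_terms(spellcheck_xml):
    """Label each query term as 'suggestion' or 'correction' (hypothetical helper).

    Assumption: origFreq > 0 means the term occurs in the index, so it is
    treated here as correctly spelled and its alternatives as suggestions only.
    """
    root = ET.fromstring(spellcheck_xml)
    # Accept either the <lst name="suggestions"> element itself or a parent of it.
    if root.get("name") == "suggestions":
        suggestions = root
    else:
        suggestions = root.find(".//lst[@name='suggestions']")
    result = {}
    if suggestions is None:
        return result
    for term in suggestions.findall("lst"):
        if term.get("name") == "collation":
            continue  # collation entries are not per-term suggestions
        orig_freq = term.find("int[@name='origFreq']")
        if orig_freq is None:
            continue  # no frequency information in this response
        in_index = int(orig_freq.text) > 0
        result[term.get("name")] = "suggestion" if in_index else "correction"
    return result


print(classify_terms(SAMPLE))  # {'clock': 'suggestion'}
```

This relies only on fields visible in the quoted responses; if origFreq is absent from a response, the sketch returns no label for that term rather than guessing.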
Thanks,
Scott
> Spellchecker sometimes falsely reports a spelling error and correction
> ----------------------------------------------------------------------
>
> Key: SOLR-4277
> URL: https://issues.apache.org/jira/browse/SOLR-4277
> Project: Solr
> Issue Type: Bug
> Components: spellchecker
> Affects Versions: 4.0
> Reporter: Jack Krupansky
>
> In some cases, the Solr spell checker improperly reports query terms as being
> misspelled.
> Using the Solr example for 4.0, I added these mini documents:
> {code}
> curl "http://localhost:8983/solr/update?commit=true" -H 'Content-type:application/csv' -d '
> id,name
> spel-1,aardvark abacus ball bill cat cello
> spel-2,abate accord band bell cattle check
> spel-3,adorn border clean clock'
> {code}
> I then issued this request:
> {code}
> curl "http://localhost:8983/solr/spell/?q=check&indent=true"
> {code}
> The spell checker falsely concluded that "check" was misspelled and
> improperly corrected it to "clock":
> {code}
> <lst name="spellcheck">
>   <lst name="suggestions">
>     <lst name="check">
>       <int name="numFound">1</int>
>       <int name="startOffset">0</int>
>       <int name="endOffset">5</int>
>       <int name="origFreq">1</int>
>       <arr name="suggestion">
>         <lst>
>           <str name="word">clock</str>
>           <int name="freq">1</int>
>         </lst>
>       </arr>
>     </lst>
>     <bool name="correctlySpelled">false</bool>
>     <lst name="collation">
>       <str name="collationQuery">clock</str>
>       <int name="hits">1</int>
>       <lst name="misspellingsAndCorrections">
>         <str name="check">clock</str>
>       </lst>
>     </lst>
>   </lst>
> </lst>
> {code}
> And if I query for "clock", it gets corrected to "check"!
> {code}
> curl "http://localhost:8983/solr/spell/?q=clock&indent=true"
> {code}
> {code}
> <lst name="suggestions">
>   <lst name="clock">
>     <int name="numFound">1</int>
>     <int name="startOffset">0</int>
>     <int name="endOffset">5</int>
>     <int name="origFreq">1</int>
>     <arr name="suggestion">
>       <lst>
>         <str name="word">check</str>
>         <int name="freq">1</int>
>       </lst>
>     </arr>
>   </lst>
>   <bool name="correctlySpelled">false</bool>
>   <lst name="collation">
>     <str name="collationQuery">check</str>
>     <int name="hits">1</int>
>     <lst name="misspellingsAndCorrections">
>       <str name="clock">check</str>
>     </lst>
>   </lst>
> </lst>
> {code}
> Note: This appears to happen only because "clock" is so close to "check".
> With other terms, such as "cattle" and "abate" below, I don't see the problem:
> {code}
> curl "http://localhost:8983/solr/spell/?q=cattle+abate+check&indent=true"
> {code}
> {code}
> <lst name="suggestions">
>   <lst name="check">
>     <int name="numFound">1</int>
>     <int name="startOffset">13</int>
>     <int name="endOffset">18</int>
>     <int name="origFreq">1</int>
>     <arr name="suggestion">
>       <lst>
>         <str name="word">clock</str>
>         <int name="freq">1</int>
>       </lst>
>     </arr>
>   </lst>
>   <bool name="correctlySpelled">false</bool>
>   <lst name="collation">
>     <str name="collationQuery">cattle abate clock</str>
>     <int name="hits">2</int>
>     <lst name="misspellingsAndCorrections">
>       <str name="cattle">cattle</str>
>       <str name="abate">abate</str>
>       <str name="check">clock</str>
>     </lst>
>   </lst>
> </lst>
> {code}
> However, it inappropriately lists "cattle" and "abate" in the
> "misspellingsAndCorrections" section even though no suggestions were offered
> for them.
> Finally, I can work around this issue by removing the following line from
> solrconfig.xml:
> {code}
> <str name="spellcheck.alternativeTermCount">5</str>
> {code}
> With that line removed, the previous request returns:
> {code}
> <lst name="suggestions">
>   <bool name="correctlySpelled">false</bool>
> </lst>
> {code}
> That makes the original problem go away, although it does raise the question
> of why my 100% correct query is still tagged as "correctlySpelled" =
> "false". That is a separate Jira.