See https://issues.apache.org/jira/browse/LUCENE-1417 and 
http://lucene.markmail.org/message/sktohlgqxcpmpf7z?q=list:org%2Eapache%2Elucene%2Esolr-user+spellchecker+Rennie

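In short, frequency is only the second-order sort level: suggestions are ordered by distance score first, and frequency only breaks ties between equally scored words. I think it should be made pluggable. A patch would be most welcome. I don't have time to produce one at the moment, but can shepherd it through.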
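
To make the two-level ordering concrete, here is a minimal sketch in Python. It is not Lucene's code. It assumes the score is a normalized edit distance (1 minus the distance divided by the longer word's length) and that words are compared case-sensitively. Both are assumptions made for illustration. The words and frequencies are the six suggestions quoted further down in this thread.

```python
def levenshtein(a: str, b: str) -> int:
    """Plain edit distance: insertions, deletions, substitutions all cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def score(query: str, candidate: str) -> float:
    # Assumption: similarity = 1 - distance / length of the longer word,
    # compared case-sensitively.
    return 1.0 - levenshtein(query, candidate) / max(len(query), len(candidate))


def rank(query, candidates):
    # First-order key: score (higher first).
    # Second-order key: frequency (higher first), used only on score ties.
    return sorted(candidates, key=lambda wf: (-score(query, wf[0]), -wf[1]))


# (word, frequency) pairs as reported in the thread, deliberately shuffled.
suggestions = [("Nike", 4099), ("Dice", 1), ("Nice", 14),
               ("Mice", 47), ("Bice", 4), ("Vice", 26)]

ranked = rank("nice", suggestions)
for word, freq in ranked:
    print(f"{word:5} score={score('nice', word):.2f} freq={freq}")
```

Under these assumptions the sketch reproduces the order reported below: the five words that are one edit away come first, sorted by frequency, and Nike comes last because the capital N counts as a second edit. If both sides were lowercased first, nike would tie with the other one-edit words and its frequency would move it above them. Whether that is what actually happens inside the spellchecker here is a guess, not a confirmed diagnosis.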

FWIW, you might also try the Jaro-Winkler (JW) distance as the default. Edit distance is not as good, since it treats differences the same no matter where in the word they occur, whereas most people tend to make spelling mistakes later on in a word, which I believe JW takes into account when scoring.
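
As a rough illustration of the prefix effect, the sketch below implements the textbook Jaro-Winkler formula in Python. It is not Lucene's JaroWinklerDistance class, and the prefix scale of 0.1 and the prefix cap of 4 characters are the usual textbook defaults, assumed here. Both misspellings in the example are a single substitution away from "converse", so plain edit distance cannot tell them apart.

```python
def jaro(s1: str, s2: str) -> float:
    """Textbook Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    if not s1 or not s2:
        return 0.0
    # Characters match only if they are equal and at most `window` positions apart.
    window = max(max(len(s1), len(s2)) // 2 - 1, 0)
    used1 = [False] * len(s1)
    used2 = [False] * len(s2)
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len(s2), i + window + 1)
        for j in range(lo, hi):
            if not used2[j] and s2[j] == ch:
                used1[i] = used2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that appear in a different order.
    m1 = [c for c, u in zip(s1, used1) if u]
    m2 = [c for c, u in zip(s2, used2) if u]
    transpositions = sum(a != b for a, b in zip(m1, m2)) // 2
    return (matches / len(s1) + matches / len(s2)
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the length of the shared prefix."""
    j = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return j + prefix * prefix_scale * (1.0 - j)


# One substitution at the start of the word vs. one near the end.
early = jaro_winkler("converse", "konverse")
late = jaro_winkler("converse", "converze")
print(f"early error: {early:.4f}")
print(f"late error:  {late:.4f}")
```

The late error scores higher because the two words share a four-character prefix, which is the behaviour described above.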

On Nov 11, 2008, at 11:52 AM, Jeff Newburn wrote:

Ok. I have managed to get the search component added (You rock Grant). I am having some interesting issues now with the suggestions. We sell shoes online, so I am trying to get it to spellcheck brand names.

When I search konverse with spelling on, it returns converse correctly. However, when I search nice (instead of nike), I get all sorts of results that are not sorted by frequency. I have even turned on onlyMorePopular, but it still returns all of the different words in no apparent order. Nike is by far the most frequent term, so how do I get it to the top?

I am currently using the svn build of Solr 1.4. I have included the
configuration as well as the result set returned for spelling suggestions.


Below is the configuration:
 <searchComponent name="spellcheck" class="solr.SpellCheckComponent">

   <!--<str name="queryAnalyzerFieldType">textSpell</str>-->
   <str name="buildOnCommit">true</str>

   <lst name="spellchecker">
     <str name="name">default</str>
     <str name="classname">solr.IndexBasedSpellChecker</str>
     <str name="field">word</str>
     <str name="spellcheckIndexDir">./spellchecker1</str>
     <str name="accuracy">0.5</str>
   </lst>
   <lst name="spellchecker">
     <str name="name">jarowinkler</str>
     <str name="field">word</str>
     <!-- Use a different Distance Measure -->
     <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
     <str name="spellcheckIndexDir">./spellchecker2</str>

   </lst>

   <lst name="spellchecker">
     <str name="classname">solr.FileBasedSpellChecker</str>
     <str name="name">file</str>
     <str name="sourceLocation">spellings.txt</str>
     <str name="characterEncoding">UTF-8</str>
     <str name="indexDir">./spellcheckerFile</str>
   </lst>
 </searchComponent>

Return results:
<lst name="spellcheck">
  <lst name="suggestions">
    <lst name="nice">
      <int name="numFound">20</int>
      <int name="startOffset">0</int>
      <int name="endOffset">4</int>
      <int name="origFreq">0</int>
      <lst name="suggestion">
        <int name="frequency">47</int>
        <str name="word">Mice</str>
      </lst>
      <lst name="suggestion">
        <int name="frequency">26</int>
        <str name="word">Vice</str>
      </lst>
      <lst name="suggestion">
        <int name="frequency">14</int>
        <str name="word">Nice</str>
      </lst>
      <lst name="suggestion">
        <int name="frequency">4</int>
        <str name="word">Bice</str>
      </lst>
      <lst name="suggestion">
        <int name="frequency">1</int>
        <str name="word">Dice</str>
      </lst>
      <lst name="suggestion">
        <int name="frequency">4099</int>
        <str name="word">Nike</str>
      </lst>


On 11/11/08 4:39 AM, "Grant Ingersoll" <[EMAIL PROTECTED]> wrote:

Hi Jeff,

A SearchComponent allows you to connect functionality with any Request
Handler, allowing you to inline spelling requests (or other things
like MoreLikeThis) with your queries, saving you from having to make
an extra request.
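
As a small sketch of what an inlined request can look like, the Python snippet below builds a query URL that carries spellcheck parameters alongside the normal query. The host, port, and /select handler path are placeholders, and it assumes the spellcheck component has already been attached to that handler in solrconfig.xml. The parameter names are the ones listed on the SpellCheckComponent wiki page.

```python
from urllib.parse import urlencode


def spellcheck_url(base: str, query: str, dictionary: str = "default", count: int = 5) -> str:
    """Build a search URL that also asks for spelling suggestions."""
    params = {
        "q": query,
        "spellcheck": "true",                  # turn the component on for this request
        "spellcheck.dictionary": dictionary,   # which configured spellchecker to use
        "spellcheck.count": count,             # maximum number of suggestions
        "spellcheck.onlyMorePopular": "true",  # only suggest more frequent terms
    }
    return f"{base}?{urlencode(params)}"


# Placeholder base URL; adjust to the handler the component is attached to.
url = spellcheck_url("http://localhost:8983/solr/select", "nice", dictionary="jarowinkler")
print(url)
```

The search results and the suggestions then come back in the same response, so no second request is needed.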

I walk through a lot of this in my article on Solr 1.3 for IBM
devWorks:
http://www.ibm.com/developerworks/java/library/j-solr-update/?S_TACT=105AGX01&S_CMP=HP

You can also refer to the Wiki at:
http://wiki.apache.org/solr/SearchComponent
and specifically:
http://wiki.apache.org/solr/SpellCheckComponent

It works independently of the query parser (e.g. dismax).

-Grant


On Nov 10, 2008, at 7:00 PM, Jeff Newburn wrote:

I am still relatively new to Solr. I have gotten the
spellcheckerrequesthandler working the way I would like. Now I am diving
into the search component version of the spell checker. I was hoping
someone could help explain what specifically the search component offers,
and how I would go about applying it to all searches that use the dismax type.

-Jeff

--------------------------
Grant Ingersoll

Lucene Helpful Hints:
http://wiki.apache.org/lucene-java/BasicsOfPerformance
http://wiki.apache.org/lucene-java/LuceneFAQ











