Jia,

I agree that for the spellcheckers to work, you need
<arr name="last-components"> instead of <arr name="components">.
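
As a rough sketch (not a tested configuration), the handler from your message might look like this with that change applied. The handler and component names are taken from your config. The "df" value is written as "spellCheck" here on the assumption that it should match the field used in your spellchecker definitions, so adjust it to your schema.

```xml
<requestHandler name="/spellcheck" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Assumption: the default field matches the spellchecker's field -->
    <str name="df">spellCheck</str>
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.dictionary">wordbreak</str>
    <str name="spellcheck.collate">true</str>
  </lst>
  <!-- "last-components" appends to the handler's default component list;
       "components" replaces that list entirely, so the query component
       would no longer run ahead of the spellcheck component. -->
  <arr name="last-components">
    <str>wc_spellcheck</str>
  </arr>
</requestHandler>
```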

But the "x-box" => "xbox" example ought to be solved by analyzing using 
WordDelimiterFilterFactory and "catenateWords=1" at query-time.  Did you 
re-index after changing your analysis chain (you need to)?  Perhaps you can 
show your full analyzer configuration, and someone here can help you find the 
problem. Also, the Analysis page on the Solr Admin UI is invaluable for 
debugging text-field analyzer problems.
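
For illustration only, a field type along these lines is one way to apply the filter on both the index and query side. The name "text_product" is hypothetical, and the tokenizer choice is an assumption: a whitespace tokenizer leaves "x-box" as a single token so the word delimiter filter still sees the hyphen, whereas a tokenizer that splits on hyphens would hand it "x" and "box" separately.

```xml
<!-- Hypothetical field type; the name "text_product" is an assumption. -->
<fieldType name="text_product" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" catenateWords="1" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- catenateWords="1" is what joins "x" and "box" into "xbox" -->
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" catenateWords="1" preserveOriginal="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

Entering "x-box" as the query text on the Analysis page for such a field should show whether the token "xbox" is actually produced.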

Getting "x box" to analyze to "xbox" is trickier (but possible).  The 
WordBreakSolrSpellChecker is probably your best option if you have cases like 
this in your data and users' queries.

Of course, if you have a finite number of products that have spelling variants 
like this, SynonymFilterFactory might be all you need.  I would recommend using 
index-time synonyms for your case rather than query-time synonyms.
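
A minimal sketch of the index-time approach, assuming a synonyms file named synonyms.txt and an entry listing the variants as equivalents (both are illustrative, not taken from your setup). With expand="true", a document containing "xbox" is also indexed under the other variants, so the query side needs no synonym filter.

```xml
<!-- synonyms.txt (hypothetical entry, one rule per line):
       xbox, x box, x-box
-->
<analyzer type="index">
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- Expands each listed variant to all the others at index time -->
  <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
          ignoreCase="true" expand="true"/>
  <filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
```

As with any analysis change, the documents have to be re-indexed before the new terms become searchable.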

James Dyer
Ingram Content Group
(615) 213-4311


-----Original Message-----
From: Ahmet Arslan [mailto:iori...@yahoo.com.INVALID] 
Sent: Wednesday, July 16, 2014 7:42 AM
To: solr-user@lucene.apache.org; j...@ece.ubc.ca
Subject: Re: questions on Solr WordBreakSolrSpellChecker and 
WordDelimiterFilterFactory

Hi Jia,

What happens when you use 

 <arr name="last-components">

instead of 

 <arr name="components">

Ahmet


On Wednesday, July 16, 2014 3:07 AM, "j...@ece.ubc.ca" <j...@ece.ubc.ca> wrote:



Hello everyone :)

I have a product called "xbox" indexed, and when the user searches for
either "x-box" or "x box" I want the "xbox" product to be
returned.  I'm new to Solr, and from reading online, I thought I needed
to use WordDelimiterFilterFactory for the "x-box" case, and
WordBreakSolrSpellChecker for the "x box" case. Is this correct?

(1) In my schema file, this is what I changed:
<filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
generateNumberParts="1" catenateWords="1" catenateNumbers="1"
catenateAll="1" splitOnCaseChange="0" preserveOriginal="1"/>

But I don't see the xbox product returned when the search term is
"x-box", so I must have missed something....

(2) I tried to use  WordBreakSolrSpellChecker together with
DirectSolrSpellChecker as shown below, but the WordBreakSolrSpellChecker
never got used:

<searchComponent name="wc_spellcheck"
class="solr.SpellCheckComponent">
    <str name="queryAnalyzerFieldType">wc_textSpell</str>

    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">spellCheck</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <str name="distanceMeasure">internal</str>
          <float name="accuracy">0.3</float>
            <int name="maxEdits">2</int>
            <int name="minPrefix">1</int>
            <int name="maxInspections">5</int>
            <int name="minQueryLength">3</int>
            <float name="maxQueryFrequency">0.01</float>
            <float name="thresholdTokenFrequency">0.004</float>
    </lst>
<lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">spellCheck</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">10</int>
  </lst>
  </searchComponent>

  <requestHandler name="/spellcheck"
class="org.apache.solr.handler.component.SearchHandler">
    <lst name="defaults">
        <str name="df">SpellCheck</str>
        <str name="spellcheck">true</str>
           <str name="spellcheck.dictionary">default</str>
        <str name="spellcheck.dictionary">wordbreak</str>
        <str name="spellcheck.build"> true</str>
           <str name="spellcheck.onlyMorePopular">false</str>
           <str name="spellcheck.count">10</str>
           <str name="spellcheck.collate">true</str>
           <str name="spellcheck.collateExtendedResults">false</str>
    </lst>
    <arr name="components">
      <str>wc_spellcheck</str>
    </arr>
  </requestHandler>

I tried to build the dictionary this way:
http://localhost/solr/coreName/select?spellcheck=true&spellcheck.build=true,
but the response returned is this:
<response>
<lst name="responseHeader">
<int name="status">0</int>
<int name="QTime">0</int>
<lst name="params">
<str name="spellcheck.build">true</str>
<str name="spellcheck">true</str>
</lst>
</lst>
<str name="command">build</str>
<result name="response" numFound="0" start="0"/>
</response>

What's the correct way to build the dictionary?
Even though my requestHandler's name="/spellcheck", I wasn't able to
use
http://localhost/solr/coreName/spellcheck?spellcheck=true&spellcheck.build=true
Is there something wrong with my definition above?

(3) I also tried to use WordBreakSolrSpellChecker without the
DirectSolrSpellChecker as shown below:
<searchComponent name="wc_spellcheck"
class="solr.SpellCheckComponent">

  <str name="queryAnalyzerFieldType">wc_textSpell</str>
    <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">spellCheck</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">10</int>
  </lst>
   </searchComponent>

   <requestHandler name="/spellcheck"
class="org.apache.solr.handler.component.SearchHandler">
    <lst name="defaults">
        <str name="df">SpellCheck</str>
        <str name="spellcheck">true</str>
           <str name="spellcheck.dictionary">default</str>
        <!--<str name="spellcheck.dictionary">wordbreak</str> -->
        <str name="spellcheck.build"> true</str>
           <str name="spellcheck.onlyMorePopular">false</str>
           <str name="spellcheck.count">10</str>
           <str name="spellcheck.collate">true</str>
           <str name="spellcheck.collateExtendedResults">false</str>
    </lst>
    <arr name="components">
      <str>wc_spellcheck</str>
    </arr>
  </requestHandler>

And I am still unable to see WordBreakSolrSpellChecker being called anywhere.

Would someone kindly help me?

Many thanks,
Jia

