Which tokenizer are you using?  StandardTokenizer will split "x-box" into "x"
and "box", same as "x box", so WordDelimiterFilterFactory never sees the
hyphen and has nothing to catenate.

If there aren't too many of these, you could also use the
PatternReplaceCharFilterFactory to map "x box" and "x-box" to "xbox" before the
tokenizer.
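
As a rough sketch only (the field type name "text_products" and the exact
pattern are my assumptions, not taken from your schema), the analyzer could
look something like this:

<fieldType name="text_products" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- Runs on the raw text before tokenizing: "x box" and "x-box" become "xbox" -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="(?i)\bx[ -]box\b" replacement="xbox"/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

Since the analyzer has no type attribute, it applies at both index and query
time, so you would need to re-index after adding it.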

Diego Fernandez - 爱国
Software Engineer
US GSS Supportability - Diagnostics


----- Original Message -----
> Jia,
> 
> I agree that for the spellcheckers to work, you need <arr
> name="last-components"> instead of <arr name="components">.
> 
> But the "x-box" => "xbox" example ought to be solved by analyzing using
> WordDelimiterFilterFactory and "catenateWords=1" at query-time.  Did you
> re-index after changing your analysis chain (you need to)?  Perhaps you can
> show your full analyzer configuration, and someone here can help you find
> the problem. Also, the Analysis page on the Solr Admin UI is invaluable for
> debugging text-field analyzer problems.
> 
> Getting "x box" to analyze to "xbox" is trickier (but possible).  The
> WordBreakSpellChecker is probably your best option if you have cases like
> this in your data & users' queries.
> 
> Of course, if you have a finite number of products that have spelling
> variants like this, SynonymFilterFactory might be all you need.  I would
> recommend using index-time synonyms for your case rather than query-time
> synonyms.
> 
> James Dyer
> Ingram Content Group
> (615) 213-4311
> 
> 
> -----Original Message-----
> From: Ahmet Arslan [mailto:iori...@yahoo.com.INVALID]
> Sent: Wednesday, July 16, 2014 7:42 AM
> To: solr-user@lucene.apache.org; j...@ece.ubc.ca
> Subject: Re: questions on Solr WordBreakSolrSpellChecker and
> WordDelimiterFilterFactory
> 
> Hi Jia,
> 
> What happens when you use
> 
>  <arr name="last-components">
> 
> instead of
> 
>  <arr name="components">
> 
> Ahmet
> 
> 
> On Wednesday, July 16, 2014 3:07 AM, "j...@ece.ubc.ca" <j...@ece.ubc.ca>
> wrote:
> 
> 
> 
> Hello everyone :)
> 
> I have a product called "xbox" indexed, and when the user searches for
> either "x-box" or "x box" I want the "xbox" product to be
> returned.  I'm new to Solr, and from reading online, I thought I needed
> to use WordDelimiterFilterFactory for the "x-box" case, and
> WordBreakSolrSpellChecker for the "x box" case. Is this correct?
> 
> (1) In my schema file, this is what I changed:
> <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
> generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> catenateAll="1" splitOnCaseChange="0" preserveOriginal="1"/>
> 
> But I don't see the xbox product returned when the search term is
> "x-box", so I must have missed something....
> 
> (2) I tried to use WordBreakSolrSpellChecker together with
> DirectSolrSpellChecker as shown below, but the WordBreakSolrSpellChecker
> never got used:
> 
> <searchComponent name="wc_spellcheck"
> class="solr.SpellCheckComponent">
>     <str name="queryAnalyzerFieldType">wc_textSpell</str>
> 
>     <lst name="spellchecker">
>       <str name="name">default</str>
>       <str name="field">spellCheck</str>
>       <str name="classname">solr.DirectSolrSpellChecker</str>
>       <str name="distanceMeasure">internal</str>
>           <float name="accuracy">0.3</float>
>             <int name="maxEdits">2</int>
>             <int name="minPrefix">1</int>
>             <int name="maxInspections">5</int>
>             <int name="minQueryLength">3</int>
>             <float name="maxQueryFrequency">0.01</float>
>             <float name="thresholdTokenFrequency">0.004</float>
>     </lst>
> <lst name="spellchecker">
>     <str name="name">wordbreak</str>
>     <str name="classname">solr.WordBreakSolrSpellChecker</str>
>     <str name="field">spellCheck</str>
>     <str name="combineWords">true</str>
>     <str name="breakWords">true</str>
>     <int name="maxChanges">10</int>
>   </lst>
>   </searchComponent>
> 
>   <requestHandler name="/spellcheck"
> class="org.apache.solr.handler.component.SearchHandler">
>     <lst name="defaults">
>         <str name="df">SpellCheck</str>
>         <str name="spellcheck">true</str>
>            <str name="spellcheck.dictionary">default</str>
>         <str name="spellcheck.dictionary">wordbreak</str>
>         <str name="spellcheck.build"> true</str>
>            <str name="spellcheck.onlyMorePopular">false</str>
>            <str name="spellcheck.count">10</str>
>            <str name="spellcheck.collate">true</str>
>            <str name="spellcheck.collateExtendedResults">false</str>
>     </lst>
>     <arr name="components">
>       <str>wc_spellcheck</str>
>     </arr>
>   </requestHandler>
> 
> I tried to build the dictionary this way:
> http://localhost/solr/coreName/select?spellcheck=true&spellcheck.build=true
> but the response returned is this:
> <response>
> <lst name="responseHeader">
> <int name="status">0</int>
> <int name="QTime">0</int>
> <lst name="params">
> <str name="spellcheck.build">true</str>
> <str name="spellcheck">true</str>
> </lst>
> </lst>
> <str name="command">build</str>
> <result name="response" numFound="0" start="0"/>
> </response>
> 
> What's the correct way to build the dictionary?
> Even though my requestHandler's name="/spellcheck", I wasn't able to
> use
> http://localhost/solr/coreName/spellcheck?spellcheck=true&spellcheck.build=true
> Is there something wrong with my definition above?
> 
> (3) I also tried to use WordBreakSolrSpellChecker without the
> DirectSolrSpellChecker as shown below:
> <searchComponent name="wc_spellcheck"
> class="solr.SpellCheckComponent">
> 
>   <str name="queryAnalyzerFieldType">wc_textSpell</str>
>     <lst name="spellchecker">
>     <str name="name">default</str>
>     <str name="classname">solr.WordBreakSolrSpellChecker</str>
>     <str name="field">spellCheck</str>
>     <str name="combineWords">true</str>
>     <str name="breakWords">true</str>
>     <int name="maxChanges">10</int>
>   </lst>
>    </searchComponent>
> 
>    <requestHandler name="/spellcheck"
> class="org.apache.solr.handler.component.SearchHandler">
>     <lst name="defaults">
>         <str name="df">SpellCheck</str>
>         <str name="spellcheck">true</str>
>            <str name="spellcheck.dictionary">default</str>
>         <!--<str name="spellcheck.dictionary">wordbreak</str> -->
>         <str name="spellcheck.build"> true</str>
>            <str name="spellcheck.onlyMorePopular">false</str>
>            <str name="spellcheck.count">10</str>
>            <str name="spellcheck.collate">true</str>
>            <str name="spellcheck.collateExtendedResults">false</str>
>     </lst>
>     <arr name="components">
>       <str>wc_spellcheck</str>
>     </arr>
>   </requestHandler>
> 
> And I am still unable to see WordBreakSolrSpellChecker being called anywhere.
> 
> Would someone kindly help me?
> 
> Many thanks,
> Jia
> 
> 
> 
