Hi Matt,

Thanks for the reply. I ran my queries through the Analysis page as you suggested, and I see exactly the split you describe. Any ideas on how to make "2WD" and "4WD" stay as single terms?

Thanks

On Dec 10, 2007, at 11:41 AM, Matt Kangas wrote:

Brendan, pull up your Solr Admin "Analysis" page and try running your queries through that. The output will tell you precisely how each analyzer affects your tokens on either the index or query side.

In my own quick test, WordDelimiterFilterFactory seems inclined to break "2WD" into ("2","WD")

(using org.apache.solr.analysis.WordDelimiterFilterFactory {catenateWords=1, catenateNumbers=1, catenateAll=0, generateNumberParts=1, generateWordParts=1})
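To illustrate the split described above, here is a rough Python sketch. It is not Solr's code: split_on_letter_digit is a hypothetical helper that models only the letter/digit boundary rule and ignores everything else the real filter does (case changes, catenation, punctuation delimiters).

```python
import re

def split_on_letter_digit(token):
    """Hypothetical helper: split a token wherever letters and digits meet.

    Only approximates one rule of WordDelimiterFilterFactory; it is an
    illustration, not the filter's actual implementation.
    """
    # Collect maximal runs of ASCII letters or of digits, in order.
    return re.findall(r"[A-Za-z]+|[0-9]+", token)

print(split_on_letter_digit("2WD"))        # ['2', 'WD']
print(split_on_letter_digit("Silverado"))  # ['Silverado']
```

Under this approximation, "2WD" never survives as a single token, which matches what the Analysis page reports.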

--matt

On Dec 9, 2007, at 6:41 PM, Brendan Grainger wrote:

Hi,

I hope you can help me. I'm having an odd problem with Solr. I have a field that represents a car. A car could have a name like "Silverado", or something like "Silverado 2WD" to denote the two-wheel-drive version. All is well when I search the field for "Silverado", but when I search for "2WD" (in any case) nothing is returned. The same applies to "Silverado 2WD", etc. I currently have the field defined as text, i.e.:

<field name="car_name" type="text" indexed="true" stored="true" />

But I've also tried defining my own (simpler) field with no luck. FYI my text field is defined like this:

<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
     <analyzer type="index">
        <!-- This is supposed to remove HTML tags before indexing -->
        <tokenizer class="solr.HTMLStripWhitespaceTokenizerFactory"/>
        <!--
       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        -->
       <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
       <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="1" catenateNumbers="1" catenateAll="0"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
     </analyzer>
     <analyzer type="query">
       <tokenizer class="solr.WhitespaceTokenizerFactory"/>
       <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
       <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
       <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1" generateNumberParts="1" catenateWords="0" catenateNumbers="0" catenateAll="0"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="solr.EnglishPorterFilterFactory" protected="protwords.txt"/>
       <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
     </analyzer>
   </fieldType>

Any help?

Thanks!
Brendan

--
Matt Kangas / [EMAIL PROTECTED]


