Hi Jeff,

You have configured ShingleFilterFactory with tokenSeparator="", so, for
example, "International Corporation" will produce the shingle
"InternationalCorporation". If this is the form you want to use for synonym
matching, it must exist in your synonym file. Does it?
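
For illustration only, an entry along these lines would be needed for the
synonym filter to match that shingle. The names below are made up, not taken
from your file:

    # hypothetical entry in synonyms.BusinessNames.txt
    InternationalCorporation, IntlCorp

An entry written as "International Corporation" (with a space) is a two-token
sequence, so it would not match the single shingle token, which has no
separator.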

Steve

> -----Original Message-----
> From: Jeff Wartes [mailto:jwar...@whitepages.com]
> Sent: Wednesday, August 10, 2011 3:43 PM
> To: solr-user@lucene.apache.org
> Subject: Can't mix Synonyms with Shingles?
> 
> 
> I would like to combine the ShingleFilterFactory with a
> SynonymFilterFactory in a field type.
> 
> I've looked at something like this using the analysis.jsp tool:
> 
>     <fieldType name="TestTerm" class="solr.TextField"
>                positionIncrementGap="100">
>       <analyzer type="index">
>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>         <filter class="solr.WordDelimiterFilterFactory"
>                 generateWordParts="1" generateNumberParts="1"
>                 stemEnglishPossessive="1"/>
>         <filter class="solr.ShingleFilterFactory" tokenSeparator=""/>
>         <filter class="solr.SynonymFilterFactory"
>                 synonyms="synonyms.BusinessNames.txt" ignoreCase="true"
>                 expand="true"/>
>         ...
>       </analyzer>
>       <analyzer type="query">
>           ...
>       </analyzer>
>     </fieldType>
> 
> However, when a ShingleFilterFactory is applied first, the
> SynonymFilterFactory appears to do nothing.
> I haven't found any documentation or other warnings against this
> combination, and I don't want to apply shingles after synonyms (this
> works) because multi-word synonyms then cause severe term expansion. I
> don't really mind if the synonyms fail to match shingles, (although I'd
> prefer they succeed) but I'd at least expect that synonyms would continue
> to match on the original tokens, as they do if I remove the
> ShingleFilterFactory.
> 
> I'm using Solr 3.3. Any clarification would be appreciated.
> 
> Thanks,
>   -Jeff Wartes
