Re: SynonymFilterFactory case changes

Erick Erickson Tue, 26 Apr 2011 17:10:36 -0700

Ahhh, I misread your post.

First, it's not the SynonymFilterFactory that's lowercasing anything. The
ignoreCase="true" setting affects the matching, not the output. The output is
probably lowercased because you have it that way in the synonyms.txt
file. At least that's what I just saw using the analysis page from the
Solr admin page.


So yes, if you want the WDF to do anything with tokens put into the
stream by SynonymFilterFactory, the replacement terms in synonyms.txt
need to carry the case you want WDF to see.
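
To make that concrete, here is a rough Python sketch of the behavior
described in this thread. It is a toy model, not Solr's code: the function
names (parse_rules, synonym_filter, split_on_case_change) are made up for
illustration, and it assumes only what was observed above, namely that
ignoreCase affects matching while the output is copied verbatim from the
synonyms file, and that WDF splits on lower-to-upper case changes.

```python
import re


def parse_rules(lines, ignore_case=True):
    """Toy stand-in for loading synonyms.txt with expand=true.

    Each comma-separated line is an equivalence group: every term in the
    group maps to all terms in the group. Only the lookup key is
    lowercased when ignore_case is set; outputs keep the file's case.
    """
    rules = {}
    for line in lines:
        terms = [t.strip() for t in line.split(",") if t.strip()]
        for term in terms:
            key = term.lower() if ignore_case else term
            outputs = rules.setdefault(key, [])
            for out in terms:
                if out not in outputs:
                    outputs.append(out)
    return rules


def synonym_filter(tokens, rules, ignore_case=True):
    """Replace each matching token with the outputs listed in the rules."""
    result = []
    for tok in tokens:
        key = tok.lower() if ignore_case else tok
        result.extend(rules.get(key, [tok]))
    return result


def split_on_case_change(token):
    """Hypothetical simplification of WDF: split at lower->upper transitions."""
    return re.sub(r"(?<=[a-z])(?=[A-Z])", " ", token).split()


if __name__ == "__main__":
    for synonyms in (["macafee, mcafee"], ["macafee, mcafee, McAfee"]):
        rules = parse_rules(synonyms)
        emitted = synonym_filter(["McAfee"], rules)
        print(synonyms, "->", [split_on_case_change(t) for t in emitted])
```

With the all-lowercase entry, nothing reaching the case-change step has a
case change left to split on. Once the mixed-case form is in the file, it
is emitted as written and gets split into "Mc" and "Afee".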

But I think you already figured all that out....

Best
Erick

On Tue, Apr 26, 2011 at 7:19 PM, Robert Petersen <rober...@buy.com> wrote:
> But in this case lowercase is after WDF.  My question: when the
> SynonymFilter gets a hit and the entries in the synonyms.txt file are all
> lowercase, do I need to add the case-changing versions so that WDF can
> work on case changes?  It appears the matched text is replaced verbatim by
> what is in the txt file, which defeats the WDF filter.  In fact, adding the
> case-changing versions of this term to the synonyms.txt file makes this
> use case work.  (yay)
>
> -----Original Message-----
> From: Erick Erickson [mailto:erickerick...@gmail.com]
> Sent: Tuesday, April 26, 2011 3:39 PM
> To: solr-user@lucene.apache.org
> Subject: Re: SynonymFilterFactory case changes
>
> Yes, order does matter.  You're right, putting, say, lowercase in front
> of WordDelimiter... will mess up the operations of WDFF.
>
> The admin/analysis page is *extremely* useful for understanding what
> happens in the analysis of input. Make sure to check the "verbose"
> checkbox.
>
> Best
> Erick
>
> On Tue, Apr 26, 2011 at 5:10 PM, Robert Petersen <rober...@buy.com> wrote:
>> So if there is a hit in the synonym filter factory, do I need to list the
>> various case-changing forms of a term so that the following
>> WordDelimiterFilter can do its 'split on case changes' work?
>> Here we see SynonymFilterFactory making all terms lowercase, because that
>> is what is in my synonyms.txt file and I have ignoreCase=true:
>> "macafee, mcafee"
>>
>> Index Analyzer
>> org.apache.solr.analysis.WhitespaceTokenizerFactory {}
>> term position   1
>> term text       McAfee
>> term type       word
>> source start,end        0,6
>> payload
>> org.apache.solr.analysis.SynonymFilterFactory
>> {synonyms=index_synonyms.txt, expand=true, ignoreCase=true}
>> term position         1
>> term text             macafee / mcafee
>> term type             word / word
>> source start,end      0,6 / 0,6
>> payload
>>
>>
>
