RE: Searching with accents

Binkley, Peter Thu, 01 Feb 2007 09:29:33 -0800

Within Lucene the solution is to index the accented and unaccented
versions of the word at the same position (i.e. without incrementing the
position counter).  Perhaps this could be added as an option to the
ISOLatin1AccentFilter? Or perhaps it's already there?
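
A rough sketch of what I mean, in plain Java rather than the actual Lucene
TokenFilter API. The class name, the Token holder and the expand() method are
all made up for illustration, and accent stripping here uses
java.text.Normalizer, which only approximates the mapping table in
ISOLatin1AccentFilter (it will not fold characters such as "ß" or "ø"). The
point is only the position increment: the unaccented variant gets an increment
of 0, so it sits at the same position as the original term.

```java
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative only: not a Lucene class. It mimics the idea of emitting
// an extra token with a position increment of 0.
class AccentFoldingSketch {

    /** A term plus its position increment relative to the previous token. */
    static final class Token {
        final String term;
        final int positionIncrement;

        Token(String term, int positionIncrement) {
            this.term = term;
            this.positionIncrement = positionIncrement;
        }

        @Override
        public String toString() {
            return term + "(+" + positionIncrement + ")";
        }
    }

    /** Decomposes to NFD and drops combining marks (an approximation). */
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    /**
     * Emits every original term with increment 1, followed by its
     * unaccented form with increment 0 when the two differ.
     */
    static List<Token> expand(List<String> terms) {
        List<Token> out = new ArrayList<>();
        for (String term : terms) {
            out.add(new Token(term, 1));
            String folded = stripAccents(term);
            if (!folded.equals(term)) {
                // Same position as the accented term: increment 0.
                out.add(new Token(folded, 0));
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // "canci\u00f3n" is "cancion" with an acute accent on the o.
        System.out.println(expand(Arrays.asList("canci\u00f3n", "de", "amor")));
    }
}
```

With both forms indexed at one position, a query for either the accented or
the unaccented spelling can match the same document, and phrase queries still
line up because the positions of the following words are unchanged.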


Peter

-----Original Message-----
From: Manuel Albela Miranda [mailto:[EMAIL PROTECTED] 
Sent: Thursday, February 01, 2007 8:35 AM
To: [email protected]
Subject: Re: Searching with accents

Thorsten Scherler wrote:
> On Thu, 2007-02-01 at 12:37 +0100, Manuel Albela Miranda wrote:
>   
>> Hello everybody,
>>
>> Do you know if there is a way to search with and without accents
>> without duplicating a field?
>>
>> I have a large index (60Gb) and don't want to have two fields with 
>> the same content one with accents and the other one without them 
>> because this field is the biggest in the index.
>>
>> Again, hope you can help me.
>>     
>
> Try something like this in your schema.xml:
> <fieldtype name="stringSimilar" class="solr.TextField" positionIncrementGap="100">
>   <analyzer type="index">
>     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>     <filter class="solr.ISOLatin1AccentFilterFactory"/>
>   </analyzer>
>   <analyzer type="query">
>     <tokenizer class="solr.LowerCaseTokenizerFactory"/>
>     <filter class="solr.ISOLatin1AccentFilterFactory"/>
>   </analyzer>
> </fieldtype>
>
> HTH
>
> salu2
>
>   
>> Thank you very much.
>>
>> Regards.
>>
>> Manu
>>
>>     
Hi Thorsten,

First of all, thank you for your message. I've been working on the
schema.xml file with the lines you sent me. Now I can filter the query,
but the problem is that I have accents in my index, so when I search for
words with accents, Solr only searches for the word without them, and I
need both. I don't know if there is a way to do this.

Regards.

Manu.
