Re: How configure SnowballAnalyzer to language Spanish

Yonik Seeley Tue, 28 Nov 2006 10:11:11 -0800

Hi Iris,

An "Analyzer" is just a tokenizer followed by a series of token filters.
Stick with the TextField that you defined below and you should be fine.
I'm not sure how the Spanish stemmer works, or whether it expects to see
accented characters in its input. If it does, you may want to move
ISOLatin1AccentFilterFactory after the stemmer.
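
A minimal sketch of that reordering, reusing the text_es field type from
the quoted message below. It assumes the Spanish stemmer does want accented
input, which is not confirmed here. The filters and attributes are the ones
from the original definition; only the position of the accent filter changes.

```xml
<fieldtype name="text_es" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <!-- Stem first, while the tokens still carry their accents
             (assumption: the Spanish stemmer relies on them). -->
        <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
        <!-- Strip accents last, so indexed and queried terms end up unaccented. -->
        <filter class="solr.ISOLatin1AccentFilterFactory"/>
    </analyzer>
</fieldtype>
```

Because the field type declares a single analyzer, the same chain should be
applied both when indexing and when parsing queries, so accented and
unaccented spellings of a word would normalize to the same term on both sides.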


-Yonik

On 11/27/06, Iris Soto <[EMAIL PROTECTED]> wrote:

Hello,

I am trying to configure Solr to index a Spanish site and I am hitting
some problems.
I have a basic install running on Tomcat.

In my schema.xml file I have the following:

<fieldtype name="text_es" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.ISOLatin1AccentFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
    </analyzer>
</fieldtype>

The Solr wiki mentions the class
org.apache.lucene.analysis.snowball.SnowballAnalyzer. How can I specify
which language it should use when it is declared like this?
<analyzer class="org.apache.lucene.analysis.snowball.SnowballAnalyzer">

I want ISOLatin1AccentFilterFactory to strip accents from characters such
as á, é, ñ..., but this does not work for queries, because a query should
still find words that contain those accented forms.
Is this configuration correct? How can I configure the analyzer for Spanish?

Thanks & Regards,


--
Iris Soto
