Hi Iris,

An "Analyzer" is just a tokenizer followed by a series of token filters. Stick with the TextField that you defined below and you should be fine.

I'm not sure how the Spanish stemmer works, or whether it expects to see accented characters. If it does, you may want to move ISOLatin1AccentFilterFactory after the stemmer.
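
For reference, a rough, untested sketch of that reordering, assuming the stemmer does want accented input. It reuses the names and attributes from your fieldtype and only changes the position of the accent filter. Because there is a single <analyzer> with no type attribute, the same chain runs at both index and query time, so accented and unaccented query terms should end up as the same token.

```xml
<fieldtype name="text_es" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- Stop words are now matched against accented tokens. -->
    <filter class="solr.StopFilterFactory" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumption: the Spanish stemmer works better on accented input. -->
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
    <!-- Strip accents last, after stemming. -->
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
</fieldtype>
```

If the stemmer turns out not to care about accents, your original ordering is fine as it is.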
-Yonik

On 11/27/06, Iris Soto <[EMAIL PROTECTED]> wrote:
Hello,

I am trying to configure Solr to index a Spanish site and I am running into some problems. I have a basic install running on Tomcat. In the schema.xml file I have the following:

<fieldtype name="text_es" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
  </analyzer>
</fieldtype>

The Solr wiki mentions the class org.apache.lucene.analysis.snowball.SnowballAnalyzer. How can I specify which language it should use?

<analyzer class="org.apache.lucene.analysis.snowball.SnowballAnalyzer">

I want ISOLatin1AccentFilterFactory to remove accented forms such as á, é, ñ..., but for queries this doesn't work, because the search should find words that contain those accented forms.

Is this configuration correct? How can I configure the analyzer for the Spanish language?

Thanks & Regards,
--
Iris Soto