StandardAnalyzer works well for most European languages. The problem will be stemming. Applying stemming via English rules to non-English languages produces...er...interesting results.
You can go ahead and create language-specific fields for each language and use StandardAnalyzer with the appropriate stopwords and stemming with each, this is a common approach.. The Snowball stemmer takes a language parameter... You need to use specific analyzers for Chinese Japanese Korean (CJK) documents though. Hope that helps Erick On Sun, Mar 13, 2011 at 7:23 PM, Vasiliki Gkouta <vgko...@csd.auth.gr> wrote: > Hello everybody, > > I have an enquiry about StandardAnalyzer. Can I use it for other languages > except from English? I give the right list of stop words at initialization. > Is there anything else inside the class that is by default set in English? > I've found the Analyzers for other languages too but they where seem to be > deprecated.. Moreover I use english and other languages, all together in my > project so I would like to ask if there is a way to use either the same > class analyzer for all of them, or analyzers of the same functionality for > all the languages. Thanks in advance! > > Best regards, > Vicky > > > > --------------------------------------------------------------------- > To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org > For additional commands, e-mail: java-user-h...@lucene.apache.org > > --------------------------------------------------------------------- To unsubscribe, e-mail: java-user-unsubscr...@lucene.apache.org For additional commands, e-mail: java-user-h...@lucene.apache.org