special character with lucene

2005-02-28 Thread Philipp_Breuss
Hello, I would like to build a search engine that handles several different languages, e.g. Spanish names, French names, ... Using a different analyzer for each language would be one solution. But how about replacing each special character (umlauts such as ä, ö, ...) with its HTML special character
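
To make the second idea concrete, here is a minimal sketch of replacing a few special characters with their HTML entities before the text is handed to the indexer. The class name `EntityEncoder` and the small character table are hypothetical and only for illustration; nothing here is part of Lucene. The same replacement would have to be applied to query strings, and an analyzer that splits on punctuation may break an entity such as `&uuml;` apart, so this is a sketch of the idea rather than a recommendation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not a Lucene class: maps a few accented
// characters to their HTML entity names.
public class EntityEncoder {
    private static final Map<Character, String> ENTITIES = new LinkedHashMap<>();
    static {
        // Unicode escapes keep the source independent of file.encoding.
        ENTITIES.put('\u00e4', "&auml;");   // a with diaeresis
        ENTITIES.put('\u00f6', "&ouml;");   // o with diaeresis
        ENTITIES.put('\u00fc', "&uuml;");   // u with diaeresis
        ENTITIES.put('\u00f1', "&ntilde;"); // n with tilde
        ENTITIES.put('\u00e9', "&eacute;"); // e with acute
    }

    // Replaces every character found in the table; all others pass through.
    public static String encode(String input) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            String entity = ENTITIES.get(c);
            out.append(entity != null ? entity : String.valueOf(c));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("M\u00fcller"));
    }
}
```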

Re: special character with lucene

2005-02-28 Thread Philipp_Breuss
Usually the text is in one specific language: English, German, Spanish, French, ... However, I don't really have a runtime identifier that tells me which language it is. I could only pick a few words and decide from there. Would that be a good idea? Is there a tool that is part of Lucene that helps decide what
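
The "pick a few words and decide from there" idea could be sketched as a stopword count: the language whose common words appear most often in the text wins. The class name `LanguageGuesser` and the tiny word lists below are hypothetical and far too small for real use; this is plain Java, not a Lucene facility, and short or mixed-language texts will often be misclassified.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: guesses a language by counting common words.
public class LanguageGuesser {
    private static final Map<String, Set<String>> STOPWORDS = new LinkedHashMap<>();
    static {
        // Illustrative word lists only; real lists would be much longer.
        STOPWORDS.put("en", new HashSet<>(Arrays.asList("the", "and", "of", "is", "to")));
        STOPWORDS.put("de", new HashSet<>(Arrays.asList("der", "die", "das", "und", "ist")));
        STOPWORDS.put("es", new HashSet<>(Arrays.asList("el", "los", "y", "es", "una")));
        STOPWORDS.put("fr", new HashSet<>(Arrays.asList("le", "les", "et", "est", "une")));
    }

    // Returns the language code with the most stopword hits,
    // or "unknown" if no list matches any token.
    public static String guess(String text) {
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+");
        String best = "unknown";
        int bestHits = 0;
        for (Map.Entry<String, Set<String>> entry : STOPWORDS.entrySet()) {
            int hits = 0;
            for (String token : tokens) {
                if (entry.getValue().contains(token)) {
                    hits++;
                }
            }
            if (hits > bestHits) {
                bestHits = hits;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(guess("Der Hund ist gross und das Haus ist klein"));
    }
}
```

The guessed code could then be used to choose which analyzer to apply to the document, which is the per-language approach from the first message.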

WG: Re: special character with lucene

2005-02-28 Thread Philipp_Breuss
My file.encoding is set to Cp1252; maybe this is the reason. However, it's a good point to replace all the umlauts (Ä, ...) with A, ... before indexing, so that people with non-umlaut keyboards can search for them. I might do that. Greetings, Philipp Daniel Naber [EMAIL PROTECTED]
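
As a rough sketch of the replacement described above (Ä becomes A, and so on), the text can be decomposed with `java.text.Normalizer` and the combining marks stripped before it is indexed; the same folding has to be applied to queries so both sides match. The class name `AccentFolder` is hypothetical. Note that this maps ä to a rather than the German transliteration ae, and that characters without a decomposition, such as ß, are left unchanged.

```java
import java.text.Normalizer;

// Hypothetical helper: strips diacritics so that accented and
// unaccented spellings index to the same term.
public class AccentFolder {
    public static String fold(String input) {
        // NFD splits a letter such as A-umlaut into 'A' plus a combining mark.
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        // Remove all combining marks, leaving only the base letters.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        // Unicode escapes avoid depending on the compiler's source encoding.
        System.out.println(fold("\u00c4rger"));
        System.out.println(fold("M\u00fcller"));
    }
}
```

Writing the test strings as Unicode escapes sidesteps the `file.encoding` question for the source file itself; input files would still need to be read with the charset they were actually written in.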