Hello,
I would like to build a search engine that covers several different
languages, e.g. Spanish names, French names, ...
- Using a different analyzer for each language would be one solution.
- But how about replacing each special character (umlauts such as ä, ö, ...)
with its HTML character entity instead?
Usually the text is in one specific language: English, German, Spanish,
French, ...
However, I don't really have a runtime identifier that tells me which
language it is. I could only pick a few words and decide from there. Is
this a good idea?
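To make the "pick a few words and decide from there" idea concrete, here is a minimal sketch in Java. It is not part of Lucene; the class name LanguageGuesser, the method guess and the tiny stopword lists are all made up for illustration. It simply counts how many very common words of each language appear in the text and picks the language with the most hits.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not a Lucene class: guesses a language by counting
// hits against small, hand-picked stopword lists (illustrative only).
final class LanguageGuesser {
    private static final Map<String, Set<String>> STOPWORDS = new LinkedHashMap<>();
    static {
        STOPWORDS.put("en", Set.of("the", "and", "of", "is", "with", "this"));
        STOPWORDS.put("de", Set.of("der", "und", "ist", "nicht", "das", "mit"));
        STOPWORDS.put("es", Set.of("el", "los", "y", "una", "por", "con"));
        STOPWORDS.put("fr", Set.of("le", "les", "et", "est", "une", "dans"));
    }

    // Returns a language code, or "unknown" if no stopword matched at all.
    static String guess(String text) {
        // Split on anything that is not a letter, after lowercasing.
        String[] tokens = text.toLowerCase(Locale.ROOT).split("[^\\p{L}]+");
        String best = "unknown";
        int bestHits = 0;
        for (Map.Entry<String, Set<String>> entry : STOPWORDS.entrySet()) {
            int hits = 0;
            for (String token : tokens) {
                if (entry.getValue().contains(token)) {
                    hits++;
                }
            }
            // Strict comparison: on a tie the language listed first wins.
            if (hits > bestHits) {
                bestHits = hits;
                best = entry.getKey();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(guess("Der Hund ist nicht mit dem Auto gefahren"));
    }
}
```

This only has a chance on running text. For short fields such as a list of names there are usually no stopwords at all, so the guess would come back as "unknown" and a different signal would be needed.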
Is there a tool that is part of Lucene that helps with deciding what
My file.encoding is set to Cp1252. Maybe this is the reason.
However, it's a good point to replace all the umlauts (Ä, ...) with A, ...
before indexing, so that people with non-umlaut keyboards can search for
them. I might do that.
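A minimal sketch of that replacement, using only the JDK class java.text.Normalizer (no Lucene API is assumed here). The class name AccentFolder and the method fold are made up for illustration. The text is decomposed so that "Ä" becomes "A" plus a combining mark, and the combining marks are then removed. The literals use \u escapes so that the result does not depend on the file.encoding used to compile the source.

```java
import java.text.Normalizer;

// Hypothetical helper: strips diacritics so "Müller" can be found as "Muller".
final class AccentFolder {
    static String fold(String text) {
        // NFD splits a letter such as U+00C4 into "A" + U+0308 (combining diaeresis).
        String decomposed = Normalizer.normalize(text, Normalizer.Form.NFD);
        // \p{M} matches combining marks; dropping them leaves the base letters.
        return decomposed.replaceAll("\\p{M}+", "");
    }

    public static void main(String[] args) {
        System.out.println(fold("M\u00FCller")); // "Müller" written with an escape
    }
}
```

The same folding would have to be applied to the query string as well, otherwise a search for the accented form no longer matches the index. Characters without a decomposition, such as ß, pass through unchanged and would need their own mapping.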
Greetings,
Philipp
Daniel Naber [EMAIL PROTECTED]