Re: European Languages search problem

2005-07-28 Thread Martin Rode
Otis, Thanks for the quick reply. The idea to emit multiple tokens is great! I was looking for a solution to another problem: I want to present a word completion list to the user, so I use reader.terms(new Term("start","here")). If I start searching at "henrie", the reader.terms() should re
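
The word-completion idea can be illustrated without an index. The sketch below is not Lucene code: it uses a sorted `TreeSet` as a stand-in for the index's sorted term dictionary, and the class name `PrefixCompletion`, the method `complete`, and the `limit` parameter are all hypothetical names chosen for illustration. It assumes the goal is to seek to the first term at or after the typed prefix and then walk forward while terms still begin with that prefix.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableSet;
import java.util.TreeSet;

// Hypothetical helper: a sorted set stands in for the sorted term dictionary.
class PrefixCompletion {

    // Returns up to `limit` terms that start with `prefix`, in sorted order.
    static List<String> complete(NavigableSet<String> terms, String prefix, int limit) {
        List<String> out = new ArrayList<>();
        // tailSet(prefix, true) starts at the first term >= prefix,
        // which is the same "seek" idea as starting an enumeration at a term.
        for (String term : terms.tailSet(prefix, true)) {
            // Terms are sorted, so the first non-matching term ends the run.
            if (!term.startsWith(prefix) || out.size() >= limit) {
                break;
            }
            out.add(term);
        }
        return out;
    }
}
```

Because the terms are sorted, the loop can stop at the first term that no longer matches, so only the matching run is visited rather than the whole dictionary.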

Re: European Languages search problem

2005-07-28 Thread Otis Gospodnetic
Hi Martin, When you write your own tokenizer/analyzer for this, you'll probably want to emit multiple tokens for words that have umlauts and such - one version with ä -> ae, the other with ä -> a perhaps. As for stripping accents from characters, somebody posted ISOLatinFilter.java (I think that
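
A minimal sketch of the "emit multiple tokens" idea, written as plain Java rather than as an actual Lucene filter. The class `UmlautVariants` and its methods are hypothetical names for illustration; the mapping covers only a few German characters and assumes lowercased tokens are wanted. One variant transliterates (ä -> ae), the other strips the accent (ä -> a), and an analyzer could index both at the same position.

```java
import java.text.Normalizer;
import java.util.LinkedHashSet;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: produces the alternative spellings to index for one word.
class UmlautVariants {

    // Transliteration variant: ä -> ae, ö -> oe, ü -> ue, ß -> ss.
    static String transliterate(String s) {
        return s.replace("\u00e4", "ae")
                .replace("\u00f6", "oe")
                .replace("\u00fc", "ue")
                .replace("\u00df", "ss");
    }

    // Accent-stripping variant: decompose, then drop combining marks (ä -> a).
    static String stripAccents(String s) {
        String decomposed = Normalizer.normalize(s, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Returns the distinct variants; a word without special characters yields one.
    static Set<String> variants(String word) {
        // Compose first so that "u" + combining diaeresis is treated like "ü".
        String lower = Normalizer.normalize(word, Normalizer.Form.NFC)
                                 .toLowerCase(Locale.ROOT);
        Set<String> out = new LinkedHashSet<>();
        out.add(transliterate(lower));
        out.add(stripAccents(lower));
        return out;
    }
}
```

With this sketch, a word containing ü produces two tokens, while a word such as "henrie" produces only one, so the index grows only for words that actually contain such characters.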