Christoph Kiehl wrote:
The analyzer should not do any case recognition. After reading through the mailing list from the last few weeks/months (I have been busy lately), I found that a super simple unique-discrimination algorithm is what most users need. The original algorithm leaves more room for extension.

Hi Volker,

I have noticed a strange problem with capitalization. Searching for "computer" results in the token "compu". Searching for "Computer", however, results in "comput". The search is supposed to be case-insensitive, so this must be a bug, right?

This problem was already mentioned on the developer list. The analyzer tries to do some noun recognition. But it does a bad job ;)
I promise I will check the stemmer in the next few days... hm... not before this weekend, I have a martial arts competition on Sunday. Mentally I'm not prepared to _fix_ anything. :)

For now you could check out the current Lucene version from CVS and just comment out the following line:

    uppercase = Character.isUpperCase( term.charAt( 0 ) );

Then just run ant to build the jar. This fixes the problem you described.
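
To illustrate why that single line can make "computer" and "Computer" stem differently, here is a minimal sketch. It is not Lucene's stemmer: `toyStem`, its suffix rules, and the `caseSensitive` switch are hypothetical and only mimic the reported symptom. The one assumption carried over from the thread is that the stemmer treats a capitalized first letter as a noun hint.

```java
import java.util.Locale;

class CaseInsensitiveStemDemo {

    // Hypothetical toy stemmer, NOT Lucene's implementation.
    // With caseSensitive == true it mimics the reported behaviour: a capitalized
    // term is treated as a noun and stripped less aggressively.
    // With caseSensitive == false the noun hint is never set, which corresponds
    // to commenting out the "uppercase = ..." line.
    static String toyStem(String term, boolean caseSensitive) {
        if (term.isEmpty()) {
            return term;
        }
        boolean uppercase = caseSensitive && Character.isUpperCase(term.charAt(0));
        String stem = term.toLowerCase(Locale.ROOT);

        // Made-up rule 1: strip a trailing "er" from longer words.
        if (stem.length() > 4 && stem.endsWith("er")) {
            stem = stem.substring(0, stem.length() - 2);
        }
        // Made-up rule 2: only for terms without the noun hint, strip a trailing "t".
        if (!uppercase && stem.length() > 4 && stem.endsWith("t")) {
            stem = stem.substring(0, stem.length() - 1);
        }
        return stem;
    }

    public static void main(String[] args) {
        // Case-dependent: query and index tokens can differ for the same word.
        System.out.println(toyStem("computer", true));
        System.out.println(toyStem("Computer", true));
        // Case-independent: both spellings yield the same token.
        System.out.println(toyStem("computer", false));
        System.out.println(toyStem("Computer", false));
    }
}
```

The point of the sketch is only that any stemming rule that depends on the first letter's case breaks case-insensitive search, because the indexed token and the query token no longer match.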
There is another problem with the umlaut conversion that should also be checked.
Greets,
Gerhard
