[
https://issues.apache.org/jira/browse/LUCENE-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784907#action_12784907
]
Simon Willnauer commented on LUCENE-2102:
-----------------------------------------
Robert, I see your point. What bugs me is that this TokenFilter changes its
behavior depending on the passed version (at least if you index Deseret with
this analyzer) :). I don't think that new code should try to do anything based
on the version. The Version ctor is totally ok for me in this case, but we
should really use Character.codePointAt() instead of CharacterUtils. When I
think about the mess ensureOnOrAfter would create throughout the code, I doubt
it would do any good in the end.
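
To illustrate the code-point suggestion, here is a minimal sketch (not the
attached patch) of lowercasing a char[] buffer one code point at a time using
only JDK calls. The class and method names are hypothetical; the Deseret
letter U+10400 is used because it lies outside the BMP and so needs a
surrogate pair.

```java
final class CodePointLowerCase {

    /**
     * Lowercases buffer[0..length) in place, one code point at a time.
     * Assumes the lowercase form occupies the same number of chars as the
     * original code point, which is what allows writing back in place.
     */
    static void lowerCaseInPlace(char[] buffer, int length) {
        for (int i = 0; i < length; ) {
            // Reads a full code point, joining a surrogate pair if present.
            int codePoint = Character.codePointAt(buffer, i, length);
            int lower = Character.toLowerCase(codePoint);
            // toChars writes the code point at index i and returns the
            // number of chars written (1 or 2).
            i += Character.toChars(lower, buffer, i);
        }
    }

    public static void main(String[] args) {
        // 'A' followed by DESERET CAPITAL LETTER LONG I (U+10400).
        char[] buf = new StringBuilder("A").appendCodePoint(0x10400)
                .toString().toCharArray();
        lowerCaseInPlace(buf, buf.length);
        // Code point after the leading 'a' is now the small letter U+10428.
        System.out.println(Integer.toHexString(Character.codePointAt(buf, 1))); // prints 10428
    }
}
```

A char-by-char loop would leave the Deseret letter untouched, because
Character.toLowerCase(char) sees each surrogate in isolation and surrogates
have no case mapping.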
> LowerCaseFilter for Turkish language
> ------------------------------------
>
> Key: LUCENE-2102
> URL: https://issues.apache.org/jira/browse/LUCENE-2102
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Analysis
> Affects Versions: 3.0
> Reporter: Ahmet Arslan
> Assignee: Robert Muir
> Priority: Minor
> Fix For: 3.1
>
> Attachments: LUCENE-2102.patch, LUCENE-2102.patch, LUCENE-2102.patch,
> LUCENE-2102.patch, LUCENE-2102.patch, LUCENE-2102.patch
>
>
> java.lang.Character.toLowerCase() converts 'I' to 'i'; however, in the Turkish
> alphabet the lowercase of 'I' is not 'i' but LATIN SMALL LETTER DOTLESS I (U+0131).
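
The behavior described in the issue can be reproduced with plain JDK calls.
The following is only a sketch to show the difference between
locale-independent and Turkish lowercasing; it is not the filter from the
attached patches, and the class and method names are hypothetical.

```java
import java.util.Locale;

final class TurkishLowerCaseDemo {

    // Turkish locale, built from a language tag (available since Java 7).
    private static final Locale TURKISH = Locale.forLanguageTag("tr-TR");

    /** Locale-independent lowercasing: 'I' becomes 'i'. */
    static String lowerRoot(String s) {
        return s.toLowerCase(Locale.ROOT);
    }

    /** Turkish lowercasing: 'I' becomes dotless i (U+0131). */
    static String lowerTurkish(String s) {
        return s.toLowerCase(TURKISH);
    }

    public static void main(String[] args) {
        // Character.toLowerCase has no locale parameter, so it yields 'i'.
        System.out.println(Character.toLowerCase('I')); // prints i
        // With Turkish rules the result is U+0131; print its hex value.
        System.out.println(Integer.toHexString(lowerTurkish("I").charAt(0))); // prints 131
    }
}
```

String.toLowerCase(Locale) creates a new String per call, so a TokenFilter
working on a reusable char[] term buffer would likely handle the Turkish
cases directly rather than call it for every token.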