[
https://issues.apache.org/jira/browse/LUCENE-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784928#action_12784928
]
Simon Willnauer commented on LUCENE-2102:
-----------------------------------------
bq. OK, I will change this. You are right, I would look at this problem
differently if we didn't have CharacterUtil, which makes it so easy to
support the old and new behavior.
Actually, an unused Version argument is silly. If we have to add it in the
future because of some change, you WANT to deprecate the ctor to make users
aware of it. That is what deprecations are made for. I would not argue about
consistency, as not every TokenFilter has a Version ctor (EdgeNGramTokenFilter,
for instance, is just the first one that comes to mind). I would remove it
completely! Use Character.codePointAt() and you are good to go.
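
To illustrate the suggestion, here is a minimal sketch of lowercasing a term
buffer one code point at a time with Character.codePointAt(), with no Version
argument involved. The class and method names are hypothetical and are not
taken from the attached patches. The sketch assumes that lowercasing never
changes how many chars a code point occupies, which is why it can write back
in place.

```java
// Hypothetical helper, not part of Lucene or of the LUCENE-2102 patches.
final class CodePointLowerCase {

    /** Lowercases buffer[0..length) in place, one code point at a time. */
    static void toLowerCase(char[] buffer, int length) {
        for (int i = 0; i < length; ) {
            // Reads a full code point, joining a surrogate pair if present.
            int codePoint = Character.codePointAt(buffer, i, length);
            // Writes the lowercased code point back at the same offset and
            // advances by the number of chars written (1 or 2).
            i += Character.toChars(Character.toLowerCase(codePoint), buffer, i);
        }
    }
}
```

Because the loop advances by the number of chars written, supplementary
characters are handled without treating each surrogate separately.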
> LowerCaseFilter for Turkish language
> ------------------------------------
>
> Key: LUCENE-2102
> URL: https://issues.apache.org/jira/browse/LUCENE-2102
> Project: Lucene - Java
> Issue Type: Improvement
> Components: Analysis
> Affects Versions: 3.0
> Reporter: Ahmet Arslan
> Assignee: Robert Muir
> Priority: Minor
> Fix For: 3.1
>
> Attachments: LUCENE-2102.patch, LUCENE-2102.patch, LUCENE-2102.patch,
> LUCENE-2102.patch, LUCENE-2102.patch, LUCENE-2102.patch, LUCENE-2102.patch
>
>
> java.lang.Character.toLowerCase() converts 'I' to 'i'; however, in the Turkish
> alphabet the lowercase of 'I' is not 'i'. It is LATIN SMALL LETTER DOTLESS I (U+0131).
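
The behavior described in the issue can be reproduced with a short sketch.
The class and method names below are hypothetical and this is not the
implementation from the attached patches; it only special-cases 'I' and
leaves everything else to Character.toLowerCase(). A real filter would likely
need more care, for example around combining marks.

```java
import java.util.Locale;

// Hypothetical illustration, not the LUCENE-2102 patch.
final class TurkishLowerCaseSketch {

    static final int LATIN_SMALL_LETTER_DOTLESS_I = 0x0131;

    /** Locale-independent lowercasing: maps 'I' to 'i'. */
    static int toLowerCaseDefault(int codePoint) {
        return Character.toLowerCase(codePoint);
    }

    /** Turkish-style lowercasing: maps 'I' to dotless i (U+0131). */
    static int toLowerCaseTurkish(int codePoint) {
        if (codePoint == 'I') {
            return LATIN_SMALL_LETTER_DOTLESS_I;
        }
        return Character.toLowerCase(codePoint);
    }

    /** The JDK's own locale-sensitive lowercasing, shown for comparison. */
    static String toLowerCaseWithTurkishLocale(String text) {
        return text.toLowerCase(Locale.forLanguageTag("tr"));
    }
}
```

Character.toLowerCase() takes no locale, so it cannot produce the dotless i;
String.toLowerCase(Locale) with a Turkish locale can.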