[ https://issues.apache.org/jira/browse/LUCENE-2102?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12784928#action_12784928 ]
Simon Willnauer commented on LUCENE-2102:
-----------------------------------------

bq. ok, I will change this. You are right, I would look at this problem differently if we didn't have CharacterUtil which makes it just so easy to support the old and new behavior.

Actually, an unused Version argument is silly. If we have to add it in the future because of some change, you WANT to deprecate the ctor to make users aware of it. That is what deprecations are made for. I would not argue from consistency, as not every TokenFilter has a Version ctor (EdgeNGramTokenFilter, for instance, is just the first one that comes to mind). I would remove it completely! Use Character.codePointAt() and you are good to go.

> LowerCaseFilter for Turkish language
> ------------------------------------
>
>                 Key: LUCENE-2102
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2102
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>    Affects Versions: 3.0
>            Reporter: Ahmet Arslan
>            Assignee: Robert Muir
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: LUCENE-2102.patch, LUCENE-2102.patch, LUCENE-2102.patch,
>                      LUCENE-2102.patch, LUCENE-2102.patch, LUCENE-2102.patch, LUCENE-2102.patch
>
>
> java.lang.Character.toLowerCase() converts 'I' to 'i'; however, in the Turkish
> alphabet the lowercase of 'I' is not 'i'. It is LATIN SMALL LETTER DOTLESS I.
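
To make the issue description and the Character.codePointAt() suggestion concrete, here is a minimal sketch of Turkish-aware lowercasing that walks a string by code point. It is not the code from any of the attached patches. The class name TurkishLowerCaseSketch and the method toTurkishLowerCase are hypothetical, and the sketch only covers the two dotted/dotless I mappings; a real filter would probably also need to handle 'I' followed by a combining dot above.

```java
public class TurkishLowerCaseSketch {
    // Code points involved in the Turkish dotted/dotless I distinction.
    private static final int LATIN_CAPITAL_I = 0x0049;                // 'I'
    private static final int LATIN_CAPITAL_I_WITH_DOT_ABOVE = 0x0130; // capital I with dot above
    private static final int LATIN_SMALL_I = 0x0069;                  // 'i'
    private static final int LATIN_SMALL_DOTLESS_I = 0x0131;          // small dotless i

    // Hypothetical helper (an assumption, not a Lucene API): lowercases a
    // string using the Turkish rules for the two capital I letters.
    static String toTurkishLowerCase(String input) {
        StringBuilder out = new StringBuilder(input.length());
        for (int i = 0; i < input.length(); ) {
            // Read a full code point so surrogate pairs are not split.
            int cp = Character.codePointAt(input, i);
            i += Character.charCount(cp);

            int lower;
            if (cp == LATIN_CAPITAL_I) {
                lower = LATIN_SMALL_DOTLESS_I;
            } else if (cp == LATIN_CAPITAL_I_WITH_DOT_ABOVE) {
                lower = LATIN_SMALL_I;
            } else {
                lower = Character.toLowerCase(cp);
            }
            out.appendCodePoint(lower);
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // The locale-insensitive default maps 'I' to 'i'.
        System.out.println(Character.toLowerCase('I') == 'i');
        // The sketch maps 'I' to LATIN SMALL LETTER DOTLESS I instead.
        System.out.println(toTurkishLowerCase("ISPARTA").equals("\u0131sparta"));
    }
}
```

Both lines in main print true. Working on code points rather than chars keeps supplementary characters intact, which is presumably why Character.codePointAt() is suggested here without needing a Version switch.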