[ https://issues.apache.org/jira/browse/LUCENE-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779319#action_12779319 ]
Simon Willnauer commented on LUCENE-2069:
-----------------------------------------

bq. Simon, those "weird" chars are indeed real codepoints that have lowercasing behavior in Unicode 4.0!

That's what I guessed :D Otherwise it would not work :). I was just wondering if there are some more expressive ones out there.

bq. Mark, true, well give me some consensus so when 3.0 is released, we can start attacking these issues!

+1

> fix LowerCaseFilter for unicode 4.0
> -----------------------------------
>
>                 Key: LUCENE-2069
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2069
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>            Reporter: Robert Muir
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: LUCENE-2069.patch, LUCENE-2069.patch, LUCENE-2069.patch
>
>
> Lowercase supplementary characters correctly.
> This only fixes the filter; the LowerCaseTokenizer is part of a more complex
> issue (CharTokenizer).
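
For readers following the thread, here is a minimal sketch in plain Java of the kind of problem the issue title refers to. It is not the attached patch and does not use any Lucene API. The class name `CodePointLowerCase` and both method names are hypothetical, chosen only for illustration. The sketch assumes that lowercasing one `char` at a time leaves surrogate pairs untouched, while lowercasing by code point handles supplementary characters such as DESERET CAPITAL LETTER LONG I (U+10400), whose lowercase form is U+10428.

```java
class CodePointLowerCase {

    // Per-char lowercasing: each half of a surrogate pair is passed to
    // Character.toLowerCase(char) on its own and comes back unchanged,
    // so supplementary characters are never lowercased.
    static String lowerCaseByChar(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); i++) {
            sb.append(Character.toLowerCase(s.charAt(i)));
        }
        return sb.toString();
    }

    // Per-code-point lowercasing: read a full code point, lowercase it
    // with Character.toLowerCase(int), and advance by the number of
    // chars that code point occupied in the input (1 or 2).
    static String lowerCaseByCodePoint(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            sb.appendCodePoint(Character.toLowerCase(cp));
            i += Character.charCount(cp);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // U+10400 is outside the BMP, so it is stored as two chars.
        String upper = new String(Character.toChars(0x10400));
        String byChar = lowerCaseByChar(upper);
        String byCodePoint = lowerCaseByCodePoint(upper);
        System.out.println("per char:       U+"
                + Integer.toHexString(byChar.codePointAt(0)).toUpperCase());
        System.out.println("per code point: U+"
                + Integer.toHexString(byCodePoint.codePointAt(0)).toUpperCase());
    }
}
```

A real token filter would more likely work in place on a `char[]` term buffer than build new strings; the string-based form is used here only to keep the sketch short.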