[ 
https://issues.apache.org/jira/browse/LUCENE-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12779319#action_12779319
 ] 

Simon Willnauer commented on LUCENE-2069:
-----------------------------------------

bq. Simon, those "weird" chars are indeed real codepoints that have lowercasing 
behavior in Unicode 4.0! 
That's what I guessed :D otherwise it would not work :). I was just 
wondering if there are some more expressive ones out there.

bq. Mark, true, well give me some consensus so when 3.0 is released, we can 
start attacking these issues! 
+1

> fix LowerCaseFilter for unicode 4.0
> -----------------------------------
>
>                 Key: LUCENE-2069
>                 URL: https://issues.apache.org/jira/browse/LUCENE-2069
>             Project: Lucene - Java
>          Issue Type: Improvement
>          Components: Analysis
>            Reporter: Robert Muir
>            Priority: Minor
>             Fix For: 3.1
>
>         Attachments: LUCENE-2069.patch, LUCENE-2069.patch, LUCENE-2069.patch
>
>
> Lowercase supplementary characters correctly. 
> This only fixes the filter; the LowerCaseTokenizer is part of a more complex 
> issue (CharTokenizer).
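
For readers unfamiliar with the problem, here is a minimal sketch of the difference between lowercasing a term buffer per char and per code point. It is not the attached patch; the class and method names are made up for illustration, and it assumes the lowercase form of a code point takes the same number of chars as the original. The test character is DESERET CAPITAL LETTER LONG I (U+10400), which lowercases to U+10428.

```java
class CodePointLowerCase {

    // Lowercases buffer[0..length) in place, one code point at a time.
    // Assumption: the lowercased code point needs as many chars as the
    // original one, so writing it back in place does not shift the rest.
    static void toLowerCase(char[] buffer, int length) {
        for (int i = 0; i < length; ) {
            int codePoint = Character.codePointAt(buffer, i, length);
            // toChars writes the code point at index i and returns the
            // number of chars written (1 for BMP, 2 for supplementary).
            i += Character.toChars(Character.toLowerCase(codePoint), buffer, i);
        }
    }

    // Per-char variant for comparison: each half of a surrogate pair is
    // lowercased on its own, so supplementary characters stay unchanged.
    static void toLowerCasePerChar(char[] buffer, int length) {
        for (int i = 0; i < length; i++) {
            buffer[i] = Character.toLowerCase(buffer[i]);
        }
    }

    public static void main(String[] args) {
        // 'A', U+10400 as a surrogate pair, 'b'
        char[] term = "A\uD801\uDC00b".toCharArray();
        toLowerCase(term, term.length);
        // Print the resulting code points in hex.
        new String(term).codePoints()
                .forEach(cp -> System.out.println(Integer.toHexString(cp)));
    }
}
```

The per-char loop leaves U+10400 untouched because `Character.toLowerCase(char)` sees only isolated surrogates, which have no case mapping.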

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.


