Supplementary Character Handling in CharTokenizer
-------------------------------------------------

                 Key: LUCENE-2183
                 URL: https://issues.apache.org/jira/browse/LUCENE-2183
             Project: Lucene - Java
          Issue Type: Improvement
          Components: Analysis
            Reporter: Simon Willnauer
             Fix For: 3.1


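CharTokenizer is an abstract base class for all Tokenizers that operate at the
character level. However, those tokenizers still use char primitives instead of
int code points, so a supplementary character (one outside the Basic
Multilingual Plane, which Java represents as a surrogate pair of two chars) is
not seen as a single character. CharTokenizer should operate on code points
while preserving backward compatibility.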
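
To illustrate the difference, here is a minimal, self-contained sketch in plain
Java. It is not the CharTokenizer code and not a proposed patch: the class name
CodePointSketch and the helpers isTokenChar, isTokenCodePoint, countByChar and
countByCodePoint are hypothetical names chosen only for this example. It
assumes a letter-based token test and shows how a char-based loop misses a
supplementary letter that a codepoint-based loop accepts.

```java
public class CodePointSketch {

    // Hypothetical char-based test: sees one UTF-16 code unit at a time.
    static boolean isTokenChar(char c) {
        return Character.isLetter(c);
    }

    // Hypothetical codepoint-based test: sees the full Unicode code point.
    static boolean isTokenCodePoint(int cp) {
        return Character.isLetter(cp);
    }

    // Walks the string one char at a time; each half of a surrogate pair
    // is tested on its own and is not a letter.
    static int countByChar(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); i++) {
            if (isTokenChar(s.charAt(i))) {
                n++;
            }
        }
        return n;
    }

    // Walks the string one code point at a time, advancing by one or two
    // chars depending on whether the code point is supplementary.
    static int countByCodePoint(String s) {
        int n = 0;
        for (int i = 0; i < s.length(); ) {
            int cp = s.codePointAt(i);
            if (isTokenCodePoint(cp)) {
                n++;
            }
            i += Character.charCount(cp);
        }
        return n;
    }

    public static void main(String[] args) {
        // "a" followed by U+10400 (DESERET CAPITAL LETTER LONG I),
        // which is encoded as the surrogate pair \uD801\uDC00.
        String s = "a\uD801\uDC00";
        System.out.println(countByChar(s));      // prints 1
        System.out.println(countByCodePoint(s)); // prints 2
    }
}
```

The codepoint loop relies only on String.codePointAt and Character.charCount,
both available since Java 5. An int-based test like isTokenCodePoint can sit
alongside a char-based one, which is one way the two behaviours could coexist
while older subclasses keep working.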

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.

