Addison Phillips [wM] wrote:

> ICU4J, the IBM opensource project, provides some UTF-16 support capabilities that
> suggest a possible solution, but there are seemingly intractable problems with the
> Character class and char data type (luckily most APIs in Java take int arguments for
> characters instead of char). And it is pretty easy to build classes for processing
> these characters as surrogate pairs using the Unicode character database.


(Late for this thread.)

ICU4J comes with its own "UCharacter" class that provides Unicode 3.1.1 properties for
all code points, using int as the single-character type.
A class library cannot, of course, fix the problem of string literals with \u: we
either write two \u escapes for a surrogate pair or call an unescape function (I think
on the UTF16 class) that understands \U.
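
To make the two options concrete, here is a rough sketch. It does not use ICU4J: the class name SupplementaryDemo and the helper unescapeBigU are hypothetical, and it relies on code point methods in java.lang that only appeared in later JDK releases, so treat it as an illustration of the idea rather than of the ICU4J API. It assumes \U is always followed by exactly eight hex digits.

```java
public class SupplementaryDemo {

    // Hypothetical helper (not the ICU4J function): expands each
    // backslash + capital U + 8 hex digits into UTF-16 code units.
    // Malformed hex digits or out-of-range values raise an exception.
    public static String unescapeBigU(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            if (s.startsWith("\\U", i) && i + 10 <= s.length()) {
                int codePoint = Integer.parseInt(s.substring(i + 2, i + 10), 16);
                // Appends one char for BMP code points, a surrogate pair otherwise.
                out.appendCodePoint(codePoint);
                i += 10;
            } else {
                out.append(s.charAt(i));
                i++;
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Option 1: write U+10400 by hand as a surrogate pair in the literal.
        String byHand = "\uD801\uDC00";

        // Option 2: let an unescape function build the pair at run time.
        String unescaped = unescapeBigU("\\U00010400");

        // The code point is handled as an int, never as a single char.
        int codePoint = byHand.codePointAt(0);
        System.out.println(Integer.toHexString(codePoint));
        System.out.println(byHand.length());
        System.out.println(byHand.equals(unescaped));
    }
}
```

Either way the string holds two char values for one character, which is why the single-character type has to be int.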

http://oss.software.ibm.com/icu4j/

markus

