Re: UCN (Java) notation beyond the BMP

2001-05-23 Thread Markus Scherer
[EMAIL PROTECTED] wrote:
> Is there a currently accepted format for Universal Character Names ... for the Unicode characters beyond U+?
Not in Java. In C99, there is \U (8 hex digits).
markus

UCN (Java) notation beyond the BMP

2001-05-22 Thread DougEwell2
Is there a currently accepted format for Universal Character Names (also known as Java escape sequences) for the Unicode characters beyond U+? As an example, I can use \u16f0 to get a Runic Belgthor, but I can't use \u10335 or anything like it to get a Gothic Qairthra*; as far as I know, I

Re: UCN (Java) notation beyond the BMP

2001-05-22 Thread addison
Could be because Java doesn't support these characters yet. I suspect that UTF-16 surrogate sequences are as close as you can get for now. As far as Java knows, you've got two surrogate characters and it doesn't know about the actual character value up there at Gothic Qairthra, or any of the
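For illustration only, here is a minimal sketch of the surrogate-sequence workaround suggested above: it turns a code point into the escape text one would type into Java source. The class and method names (SurrogateEscapes, toJavaEscape) are hypothetical and do not come from the thread; the arithmetic is the standard UTF-16 encoding of a code point above U+FFFF.

```java
final class SurrogateEscapes {

    /**
     * Returns the Java source escape text for a code point.
     * BMP code points yield one escape; supplementary code points
     * yield a high surrogate escape followed by a low surrogate escape.
     */
    static String toJavaEscape(int codePoint) {
        if (codePoint < 0 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("not a Unicode code point: " + codePoint);
        }
        if (codePoint <= 0xFFFF) {
            return String.format("\\u%04X", codePoint);
        }
        int offset = codePoint - 0x10000;      // 20-bit value split across the pair
        int high = 0xD800 + (offset >> 10);    // top 10 bits
        int low = 0xDC00 + (offset & 0x3FF);   // bottom 10 bits
        return String.format("\\u%04X\\u%04X", high, low);
    }

    public static void main(String[] args) {
        System.out.println(toJavaEscape(0x16F0));   // Runic Belgthor, inside the BMP
        System.out.println(toJavaEscape(0x10335));  // Gothic Qairthra, beyond the BMP
    }
}
```

For U+10335 this yields the pair \uD800\uDF35. As the message says, the language at this point treats that as two separate surrogate char values, not as one Gothic letter.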

Re: UCN (Java) notation beyond the BMP

2001-05-22 Thread Michael (michka) Kaplan
From: [EMAIL PROTECTED]
> (JDK 1.4 will only add support for Unicode 3.0, not 3.1, although presumably it will add the infrastructure for surrogate pairs which will allow your question to get answered meaningfully!)
Actually, no proper support for surrogate pairs means no support for *UNICODE

Re: UCN (Java) notation beyond the BMP

2001-05-22 Thread addison
I miswrote when I said infrastructure... there is clearly infrastructure to support the characters and for valid pairs (that is, a high surrogate followed by a low surrogate counting as one character), but that's it. And that's as good as, say, Win2K can do at this point. The problem, of course, is
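As a rough sketch of what "valid pairs" means here, the following Java code checks whether two char values form a high/low surrogate pair and, if so, combines them into the single code point they represent. The names (SurrogatePairs, isValidPair, combine, countCharacters) are hypothetical, chosen for this illustration, and the code uses plain arithmetic only, so it assumes nothing about library support for supplementary characters.

```java
final class SurrogatePairs {

    static boolean isHighSurrogate(char c) {
        return c >= 0xD800 && c <= 0xDBFF;
    }

    static boolean isLowSurrogate(char c) {
        return c >= 0xDC00 && c <= 0xDFFF;
    }

    /** A valid pair is a high surrogate immediately followed by a low surrogate. */
    static boolean isValidPair(char first, char second) {
        return isHighSurrogate(first) && isLowSurrogate(second);
    }

    /** Combines a valid pair into the code point it encodes. */
    static int combine(char high, char low) {
        if (!isValidPair(high, low)) {
            throw new IllegalArgumentException("not a valid surrogate pair");
        }
        return 0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);
    }

    /** Counts characters, treating each valid pair as one character. */
    static int countCharacters(String s) {
        int count = 0;
        for (int i = 0; i < s.length(); i++) {
            if (i + 1 < s.length() && isValidPair(s.charAt(i), s.charAt(i + 1))) {
                i++; // skip the low surrogate of the pair just counted
            }
            count++;
        }
        return count;
    }
}
```

Recognizing pairs this way covers only the infrastructure level described in the message. It says nothing about the properties of the resulting character, such as its script or case.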