Alright, more specifically: a char is no longer guaranteed to be a complete Unicode character. It is one 16-bit code unit of a UTF-16 sequence, and a supplementary character (above U+FFFF) takes two of them.
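To make that concrete, here is a minimal sketch of the encoding rule Richard describes below, next to the Java 5 library calls that should give the same answer. The class name Utf16Sketch and the encode helper are made up for illustration; they are not Harmony or JDK code. Only Character.toChars, String.codePointCount and String.codePointAt are real Java 5 APIs.

```java
// Hypothetical demo class; Utf16Sketch and encode() are illustrative names only.
class Utf16Sketch {
    // Encode one code point by hand as UTF-16:
    // BMP values map to a single 16-bit unit, supplementary values
    // (U+10000 - U+10FFFF) map to a high/low surrogate pair.
    static char[] encode(int codePoint) {
        if (codePoint < 0 || codePoint > 0x10FFFF
                || (codePoint >= 0xD800 && codePoint <= 0xDFFF)) {
            throw new IllegalArgumentException("not a Unicode scalar value");
        }
        if (codePoint < 0x10000) {
            return new char[] { (char) codePoint };
        }
        int offset = codePoint - 0x10000;                // 20-bit value
        char high = (char) (0xD800 + (offset >> 10));    // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));   // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        char[] bmp = encode(0x4E2D);
        char[] supp = encode(0x10000);
        System.out.printf("U+4E2D  -> %d char(s)%n", bmp.length);
        System.out.printf("U+10000 -> %04X %04X%n", (int) supp[0], (int) supp[1]);

        // The same conversion through the Java 5 API: a single code point
        // above U+FFFF occupies two chars in a String.
        String s = new String(Character.toChars(0x10FFFF));
        System.out.println(s.length());                       // number of chars
        System.out.println(s.codePointCount(0, s.length()));  // number of code points
        System.out.println(Integer.toHexString(s.codePointAt(0)));
    }
}
```

So a single char variable can never hold U+10FFFF, but a char[] or String can, which I think is the distinction Richard is drawing.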

-----Original Message-----
From: Richard Liang [mailto:[EMAIL PROTECTED]
Sent: Monday, March 06, 2006 8:10 PM
To: [email protected]
Subject: Re: J2SE 5 Plan / Roadmap

Nathan Beyer wrote:
> Maybe I didn't read this article right, but my understanding is that in Java
> 5, char or at least an array of char is considered a UTF-16 sequence:
> http://java.sun.com/developer/technicalArticles/Intl/Supplementary/
>
Yes, Nathan. Char sequences and char arrays are considered UTF-16
sequences. But as a data type, 'char' cannot be UTF-16. How can you assign
the Unicode character U+10FFFF to a char?

> Here's the important quote from this article:
>
> "In the end, the decision was for a tiered approach:
> * Use the primitive type int to represent code points in low-level APIs,
>   such as the static methods of the Character class.
> * Interpret char sequences in all forms as UTF-16 sequences, and promote
>   their use in higher-level APIs.
> * Provide APIs to easily convert between various char and code
>   point-based representations."
>
> -----Original Message-----
> From: Richard Liang [mailto:[EMAIL PROTECTED]
> Sent: Monday, March 06, 2006 7:34 PM
> To: [email protected]
> Subject: Re: J2SE 5 Plan / Roadmap
>
> Nathan Beyer wrote:
>> Here's a good link with a summary of some of the new features:
>> http://java.sun.com/developer/technicalArticles/releases/j2se15/
>>
>> One of the major pieces that I missed is uplifting everything to support
>> the Unicode 4.0 updates; char is now UTF-16 encoded value, etc.
>>
> 'char' cannot be a UTF-16 encoded value :-) As you know, char is a 16-bit
> value in Java. However, the UTF-16 encoding algorithm is:
>
> For characters in the range U+0000 - U+FFFF (excluding U+D800 - U+DFFF),
> just use the same unsigned 16-bit numeric value as their UTF-16 encoded
> value. For example, the UTF-16 encoded value for character '\u4E2D' is
> 0x4E2D.
>
> For characters in U+10000 - U+10FFFF, use a surrogate pair as their
> UTF-16 encoded value. For example, the UTF-16 encoded value for
> character U+10000 is a 32-bit value 0xD800 0xDC00.
>
> Please refer to "Glossary of Unicode Terms" http://www.unicode.org/glossary/
>
> Also please forgive my prolixity :-)
>
>>> -----Original Message-----
>>> From: Nathan Beyer [mailto:[EMAIL PROTECTED]
>>> Sent: Friday, March 03, 2006 6:19 PM
>>> To: [email protected]
>>> Subject: RE: J2SE 5 Plan / Roadmap
>>>
>>> This can probably be split up into a couple pieces; here are some of my
>>> thoughts.
>>>
>>> * JLS3 bytecode support - Can the current VM load JLS3 class files? If
>>> not, what's missing (besides the major.minor version header)? Can some
>>> features be enabled/uplifted to allow APIs to move forward? Since
>>> generics are a compile-time feature, can code begin using generics by
>>> just allowing the JLS3 major.minor version? In any case, this probably
>>> breaks down into a couple topics: generics, annotations, enums,
>>> return-value-covariance and others.
>>>
>>> * Java 5 APIs [NEW] - Most of the brand-new APIs have been identified
>>> (java.lang.management, etc) and some are already being checked in. It
>>> seems like there's already an undocumented process here, which is to
>>> uplift and build any APIs to the Java 5 specification while leaving out
>>> any Java 5-specific language features, like generification, enums and
>>> annotations.
>>>
>>> * Java 5 APIs [Generification] - Once we can allow code with generics to
>>> be compiled and loaded, then we could begin a phase of uplifting all of
>>> the newly generified APIs (Collections, etc).
>>>
>>> * Java 5 APIs [Annotations] - I suspect, but could be wrong, that
>>> Annotations support will be more difficult to implement in the VM, as
>>> there is some runtime support. When this is available, another phase can
>>> be initiated to uplift all of the APIs with the appropriate Annotations
>>> (@Deprecated, @Override, etc).
>>>
>>> I'm not too familiar with the JAPI tool, but if Stuart can get the
>>> Harmony reports to run against the Java 5 specifications, I think this
>>> would provide an enormous boost in the arena of new Java 5 APIs. Does
>>> JAPI support JLS3 features, like generics, annotations and enum
>>> checking?
>>>
>>> Those are some of my thoughts at least.
>>>
>>> BTW: JLS3 == Java Language Specification, 3rd Edition
>>>
>>> -Nathan
>>>
>>>> -----Original Message-----
>>>> From: Geir Magnusson Jr [mailto:[EMAIL PROTECTED]
>>>> Sent: Friday, March 03, 2006 9:40 AM
>>>> To: [email protected]
>>>> Subject: J2SE 5 Plan / Roadmap
>>>>
>>>> We should probably start thinking about how we'll move forward. There
>>>> have been a few casual conversations, but let's start getting some
>>>> things down here.
>>>>

--
Richard Liang
China Software Development Lab, IBM
