Nathan Beyer wrote:
Alright, more specifically: a char is no longer guaranteed to be a complete
Unicode character, but rather one 16-bit code unit (two bytes) of a UTF-16
sequence.
Agree :-)
-----Original Message-----
From: Richard Liang [mailto:[EMAIL PROTECTED]
Sent: Monday, March 06, 2006 8:10 PM
To: [email protected]
Subject: Re: J2SE 5 Plan / Roadmap

Nathan Beyer wrote:
Maybe I didn't read this article right, but my understanding is that in
Java 5, char or at least an array of char is considered a UTF-16 sequence:
http://java.sun.com/developer/technicalArticles/Intl/Supplementary/
Yes, Nathan. Char sequences and char arrays are treated as UTF-16 sequences. But as a data type, a single 'char' cannot be UTF-16: it holds only one 16-bit code unit. How could you assign the Unicode character U+10FFFF to a char?
Here's the important quote from this article:

"In the end, the decision was for a tiered approach:
    * Use the primitive type int to represent code points in low-level
      APIs, such as the static methods of the Character class.
    * Interpret char sequences in all forms as UTF-16 sequences, and
      promote their use in higher-level APIs.
    * Provide APIs to easily convert between various char and code
      point-based representations."
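
For what it's worth, the three tiers in that quote can be sketched roughly as follows. This is only an illustration, not code from the article or from Harmony; the class name CodePointTiers and its helper methods are made up for the example, while Character.toChars, String.codePointCount and String.codePointAt are the Java 5 APIs.

```java
public class CodePointTiers {
    // Tier 1: an int can hold any code point, including supplementary ones
    // such as U+10FFFF, which a single 16-bit char cannot.
    static final int SUPPLEMENTARY = 0x10FFFF;

    // Tier 3: convert a code point into its UTF-16 char sequence.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Tier 2: the String is interpreted as UTF-16, so count code points
    // rather than chars when "characters" are what you care about.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        String s = fromCodePoint(SUPPLEMENTARY);
        System.out.println("chars:       " + s.length());
        System.out.println("code points: " + codePoints(s));
        System.out.println("first code point: U+"
                + Integer.toHexString(s.codePointAt(0)).toUpperCase());
        System.out.println("starts with high surrogate: "
                + Character.isHighSurrogate(s.charAt(0)));
    }
}
```

The point is that s.length() and the code point count differ for supplementary characters, which is why the low-level APIs take an int.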



-----Original Message-----
From: Richard Liang [mailto:[EMAIL PROTECTED]
Sent: Monday, March 06, 2006 7:34 PM
To: [email protected]
Subject: Re: J2SE 5 Plan / Roadmap

Nathan Beyer wrote:
Here's a good link with a summary of some of the new features:
http://java.sun.com/developer/technicalArticles/releases/j2se15/

One of the major pieces that I missed is uplifting everything to support
the Unicode 4.0 updates; char is now a UTF-16 encoded value, etc.
'char' cannot be a UTF-16 encoded value :-) As you know, char is a 16-bit value in Java. However, the UTF-16 encoding algorithm is:
    * For characters in the range U+0000 - U+FFFF (excluding U+D800 - U+DFFF), use the same unsigned 16-bit numeric value as the UTF-16 encoded value. For example, the UTF-16 encoded value for character '\u4E2D' is 0x4E2D.
    * For characters in the range U+10000 - U+10FFFF, use a surrogate pair as the UTF-16 encoded value. For example, the UTF-16 encoded value for character U+10000 is the 32-bit value 0xD800 0xDC00, which does not fit in a single char.
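
As a minimal sketch of the algorithm described above (the class name Utf16Sketch and the method encode are hypothetical, written only for this mail; in real code Character.toChars from Java 5 does the same job):

```java
public class Utf16Sketch {
    // Encode one Unicode code point as UTF-16 code units (one or two chars).
    static char[] encode(int codePoint) {
        if (codePoint < 0 || codePoint > 0x10FFFF
                || (codePoint >= 0xD800 && codePoint <= 0xDFFF)) {
            throw new IllegalArgumentException("not an encodable code point");
        }
        if (codePoint <= 0xFFFF) {
            // U+0000 - U+FFFF: the 16-bit value is used as-is.
            return new char[] { (char) codePoint };
        }
        // U+10000 - U+10FFFF: split the 20-bit offset into a surrogate pair.
        int offset = codePoint - 0x10000;
        char high = (char) (0xD800 + (offset >> 10));   // top 10 bits
        char low = (char) (0xDC00 + (offset & 0x3FF));  // bottom 10 bits
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        for (int cp : new int[] { 0x4E2D, 0x10000, 0x10FFFF }) {
            StringBuilder sb = new StringBuilder();
            for (char c : encode(cp)) {
                sb.append(Integer.toHexString(c)).append(' ');
            }
            System.out.println("U+" + Integer.toHexString(cp) + " -> " + sb.toString().trim());
        }
    }
}
```

For U+10000 this yields the pair 0xD800 0xDC00 from the example, i.e. two chars rather than one.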

Please refer to "Glossary of Unicode Terms"
http://www.unicode.org/glossary/
Also please forgive my prolixity :-)
-----Original Message-----
From: Nathan Beyer [mailto:[EMAIL PROTECTED]
Sent: Friday, March 03, 2006 6:19 PM
To: [email protected]
Subject: RE: J2SE 5 Plan / Roadmap

This can probably be split up into a couple pieces; here are some of my
thoughts.

* JLS3 bytecode support - Can the current VM load JLS3 class files? If
not, what's missing (besides the major.minor version header)? Can some
features be enabled/uplifted to allow APIs to move forward? Since
generics are a compile-time feature, can code begin using generics by
just allowing the JLS3 major.minor version? In any case, this probably
breaks down into a couple topics: generics, annotations, enums,
return-value-covariance and others.
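
To make the "major.minor version header" point concrete, here is a rough sketch of reading it from the first eight bytes of a class file. The class ClassVersionSketch and its methods are hypothetical names for this example and say nothing about how the Harmony VM actually checks versions; the layout (magic 0xCAFEBABE, then minor, then major, all big-endian) and the value 49 for Java 5 come from the class file format.

```java
import java.nio.ByteBuffer;

public class ClassVersionSketch {
    // Class files compiled for Java 5 carry major version 49 (1.4 used 48).
    static final int JAVA_5_MAJOR = 49;

    // Returns {major, minor} read from the start of a class file.
    static int[] readVersion(byte[] classBytes) {
        if (classBytes.length < 8) {
            throw new IllegalArgumentException("too short for a class-file header");
        }
        ByteBuffer buf = ByteBuffer.wrap(classBytes); // big-endian by default
        if (buf.getInt() != 0xCAFEBABE) {
            throw new IllegalArgumentException("missing 0xCAFEBABE magic");
        }
        int minor = buf.getShort() & 0xFFFF; // minor comes first in the file
        int major = buf.getShort() & 0xFFFF;
        return new int[] { major, minor };
    }

    static boolean needsJava5Vm(byte[] classBytes) {
        return readVersion(classBytes)[0] >= JAVA_5_MAJOR;
    }

    public static void main(String[] args) {
        // Hand-built header standing in for a real .class file.
        byte[] header = { (byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE, 0, 0, 0, 49 };
        int[] v = readVersion(header);
        System.out.println(v[0] + "." + v[1] + " needs Java 5 VM: " + needsJava5Vm(header));
    }
}
```

A VM that accepts this header still has to cope with whatever new attributes the class file contains, which is the open question above.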

* Java 5 APIs [NEW] - Most of the brand-new APIs have been identified
(java.lang.management, etc) and some are already being checked in. It
seems like there's already an undocumented process here, which is to
uplift and build any APIs to the Java 5 specification while leaving out
any Java 5-specific language features, like generification, enums and
annotations.

* Java 5 APIs [Generification] - Once we can allow code with generics to
be compiled and loaded, then we could begin a phase of uplifting all of
the newly generified APIs (Collections, etc).

* Java 5 APIs [Annotations] - I suspect, but could be wrong, that
Annotations support will be more difficult to implement in the VM, as
there is some runtime support. When this is available, another phase can
be initiated to uplift all of the APIs with the appropriate Annotations
(@Deprecated, @Override, etc).
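
A small sketch of what "runtime support" means here, assuming a VM and class library that already implement Java 5 reflection. The class AnnotationRetentionSketch, its nested Legacy class and the helper visibleAtRuntime are hypothetical; Method.isAnnotationPresent is the Java 5 reflection API. Deprecated is declared with runtime retention, so the VM has to keep it and expose it, while Override has source retention and never reaches the class file.

```java
import java.lang.annotation.Annotation;
import java.lang.reflect.Method;

public class AnnotationRetentionSketch {
    // Hypothetical class carrying the two annotations mentioned above.
    static class Legacy {
        @Deprecated
        public void oldCall() { }

        @Override
        public String toString() { return "Legacy"; }
    }

    // True if the named method of 'type' carries 'annotation' at run time.
    static boolean visibleAtRuntime(Class<?> type, String methodName,
                                    Class<? extends Annotation> annotation) {
        for (Method m : type.getDeclaredMethods()) {
            if (m.getName().equals(methodName)) {
                return m.isAnnotationPresent(annotation);
            }
        }
        throw new IllegalArgumentException("no such method: " + methodName);
    }

    public static void main(String[] args) {
        System.out.println("Deprecated visible: "
                + visibleAtRuntime(Legacy.class, "oldCall", Deprecated.class));
        System.out.println("Override visible:   "
                + visibleAtRuntime(Legacy.class, "toString", Override.class));
    }
}
```

So only annotations with runtime retention need the VM and reflection work; source-retention ones are purely a compiler matter.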

I'm not too familiar with the JAPI tool, but if Stuart can get the
Harmony reports to run against the Java 5 specifications, I think this
would provide an enormous boost in the arena of new Java 5 APIs. Does
JAPI support JLS3 features, like generics, annotations and enum checking?

Those are some of my thoughts at least.

BTW: JLS3 == Java Language Specification, 3rd Edition

-Nathan

-----Original Message-----
From: Geir Magnusson Jr [mailto:[EMAIL PROTECTED]
Sent: Friday, March 03, 2006 9:40 AM
To: [email protected]
Subject: J2SE 5 Plan / Roadmap

We should probably start thinking about how we'll move forward.  There
have been a few casual conversations, but let's start getting some
things down here.




--
Richard Liang
China Software Development Lab, IBM
