Philippe Verdy wrote:

> [...] char, whose values are 16-bit unsigned integers
> representing Unicode characters (section 2.1).

Despite your ingenious special pleading, I don't see how this can mean
anything except that chars must be 16-bit unsigned integers.
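
For what it's worth, the same reading can be checked from inside the
language. This is only a sketch, assuming a Java 5 or later runtime (where
Character.SIZE exists); the class and method names are made up for
illustration.

```java
public class CharWidth {
    // Number of bits used to represent a char, as reported by the library.
    static int charBits() {
        return Character.SIZE;
    }

    // Largest value a char can hold, widened to int.
    static int maxCharValue() {
        return Character.MAX_VALUE;
    }

    // A narrowing cast to char keeps only the low 16 bits of the int.
    static int truncated(int value) {
        return (char) value;
    }

    public static void main(String[] args) {
        System.out.println(charBits());          // 16
        System.out.println(maxCharValue());      // 65535
        System.out.println(truncated(0x10000));  // 0: bit 16 is lost
        System.out.println(truncated(-1));       // 65535: char is unsigned
    }
}
```

A code point above U+FFFF simply does not fit: the cast discards everything
past bit 15, and a negative int comes back as a positive value.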

> The Java language still lacks a way to specify a literal for a character out
> of the BMP. Of course one can use the syntax '\uD800\uDC00' but this would
> not compile with the current _compilers_, that expect only one char in the
> literal. In a String literal "\uD800\uDC00" becomes the 4-byte UTF-8
> sequence for _one_ Unicode codepoint in the compiled class.

Character literals are crocky anyhow.  IMHO modern programming languages
should not have a Character type, but deal only in Strings.
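
As a rough illustration of getting by with Strings alone, here is a sketch
that handles U+10000 without ever writing a character literal. It assumes the
code point methods added to the library in Java 5 (Character.toChars,
String.codePointAt, String.codePointCount); the class and helper names are
hypothetical.

```java
public class SupplementaryDemo {
    // Build a String holding exactly one code point. Character.toChars
    // returns a surrogate pair for anything above U+FFFF.
    static String fromCodePoint(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    public static void main(String[] args) {
        String s = fromCodePoint(0x10000);
        System.out.println(s.length());                             // 2 chars
        System.out.println(s.codePointCount(0, s.length()));        // 1 code point
        System.out.println(Integer.toHexString(s.codePointAt(0)));  // 10000
        System.out.println(s.equals("\uD800\uDC00"));               // true
    }
}
```

The String is two chars long but holds one code point, which is exactly the
distinction a single char cannot express.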

> 2. The initial spec of UTF-32 and UTF-8 by ISO allowed many more planes with
> 31-bit codepoints, and maybe there will be an agreement sometime in the
> future between ISO and Unicode to define new codepoints out of the current
> standard 17 first planes that can be safely converted with UTF-16, 

I doubt it very much.  17 planes is waaaay more than sufficient.
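
The figure of 17 falls straight out of the surrogate ranges, as this
back-of-the-envelope sketch shows. The class and method names are invented
for the example; only the range boundaries come from the UTF-16 definition.

```java
public class PlaneCount {
    static final int PLANE_SIZE = 0x10000;  // 65,536 code points per plane

    // Every (high, low) surrogate combination names one supplementary code point.
    static int surrogatePairs() {
        int highs = 0xDBFF - 0xD800 + 1;  // 1024 high surrogates
        int lows = 0xDFFF - 0xDC00 + 1;   // 1024 low surrogates
        return highs * lows;
    }

    // The BMP plus however many planes the surrogate pairs can cover.
    static int planes() {
        return 1 + surrogatePairs() / PLANE_SIZE;
    }

    // Highest code point reachable through UTF-16.
    static int maxCodePoint() {
        return planes() * PLANE_SIZE - 1;
    }

    public static void main(String[] args) {
        System.out.println(surrogatePairs());                     // 1048576
        System.out.println(planes());                             // 17
        System.out.println(Integer.toHexString(maxCodePoint()));  // 10ffff
    }
}
```

That is 1,114,112 code points in all, with U+10FFFF as the last one.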

-- 
John Cowan  [EMAIL PROTECTED]  www.reutershealth.com  www.ccil.org/~cowan
Assent may be registered by a signature, a handshake, or a click of a computer
mouse transmitted across the invisible ether of the Internet. Formality
is not a requisite; any sign, symbol or action, or even willful inaction,
as long as it is unequivocally referable to the promise, may create a contract.
       --_Specht v. Netscape_
