From:
http://java.sun.com/javase/technologies/core/basic/intl/faq.jsp#core-textrep

 The primitive data type char in the Java programming language is an
unsigned 16-bit integer that can represent a Unicode code point in the
range U+0000 to U+FFFF, or the code units of UTF-16
<http://java.sun.com/javase/technologies/core/basic/intl/faq.jsp#utf-16>.

Confusing, eh?
I guess you would say that a single Java char, taken by itself, covers only a
UTF-16 subset.

So, technically not UTF-16, but practically UTF-16 (the two-byte subset of
UTF-16, U+0000 to U+FFFF).
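
To make the char versus code point distinction concrete, here is a small
sketch. The class and method names are made up for illustration; the calls
themselves (String.length, String.codePointCount, Character.toChars,
Character.isHighSurrogate) are standard java.lang methods. It only shows that
a character outside U+0000..U+FFFF occupies two char values in a String.

```java
final class CharVsCodePoint {
    // Number of 16-bit char values (UTF-16 code units) in the string.
    static int codeUnits(String s) {
        return s.length();
    }

    // Number of Unicode code points; a surrogate pair counts once.
    static int codePoints(String s) {
        return s.codePointCount(0, s.length());
    }

    public static void main(String[] args) {
        // U+0041 lies inside U+0000..U+FFFF and fits in one char.
        String bmp = "A";
        // U+1D11E lies above U+FFFF, so it needs two chars.
        String supplementary = new String(Character.toChars(0x1D11E));

        System.out.println(codeUnits(bmp) + " " + codePoints(bmp));                     // 1 1
        System.out.println(codeUnits(supplementary) + " " + codePoints(supplementary)); // 2 1
        // The first char of the pair is a high surrogate, not a character.
        System.out.println(Character.isHighSurrogate(supplementary.charAt(0)));         // true
    }
}
```

So length() counts code units, not characters, which is where the "two bytes
per character" shorthand stops being accurate.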


On Tue, Aug 28, 2012 at 11:18 AM, Paul Gilmartin <[email protected]> wrote:

> On Tue, 28 Aug 2012 10:39:03 -0500, Kirk Wolf wrote:
> >
> >UTF-16 is used in Java (and other languages) as the internal representation
> >of characters and strings (each character represented by two bytes).
> >
> No.  Not according to:
>
>     http://en.wikipedia.org/wiki/UTF-16
>
>     UTF-16 (16-bit Unicode Transformation Format) is a character encoding for
>     Unicode capable of encoding 1,112,064 numbers (called code points) in the
>     Unicode code space from 0 to 0x10FFFF. It produces a variable-length result
>     of either one or two 16-bit code units per code point.
>
> And:
>
>     http://www.ietf.org/rfc/rfc2781.txt
>
>    The rules for how characters are encoded in UTF-16 are:
>
>    -  Characters with values less than 0x10000 are represented as a
>       single 16-bit integer with a value equal to that of the character
>       number.
>
>    -  Characters with values between 0x10000 and 0x10FFFF are
>       represented by a 16-bit integer with a value between 0xD800 and
>       0xDBFF (within the so-called high-half zone or high surrogate
>       area) followed by a 16-bit integer with a value between 0xDC00 and
>       0xDFFF (within the so-called low-half zone or low surrogate area).
>
>    -  Characters with values greater than 0x10FFFF cannot be encoded in
>       UTF-16.
>
> -- gil
>
> ----------------------------------------------------------------------
> For IBM-MAIN subscribe / signoff / archive access instructions,
> send email to [email protected] with the message: INFO IBM-MAIN
>
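
For what it's worth, the three RFC 2781 rules quoted above can be sketched in
a few lines of Java. The class and method names here are hypothetical, and the
bit arithmetic is my reading of the quoted rules, not code taken from the RFC
or the JDK. The result is compared against the standard Character.toChars as
a sanity check. The sketch does not reject values in 0xD800..0xDFFF, which are
reserved for surrogates.

```java
import java.util.Arrays;

final class Utf16Encode {
    // Encode one character number into one or two 16-bit code units.
    static char[] encode(int codePoint) {
        // Rule 3: values greater than 0x10FFFF cannot be encoded.
        if (codePoint < 0 || codePoint > 0x10FFFF) {
            throw new IllegalArgumentException("cannot be encoded in UTF-16");
        }
        // Rule 1: values less than 0x10000 are a single 16-bit integer.
        if (codePoint < 0x10000) {
            return new char[] { (char) codePoint };
        }
        // Rule 2: split the remaining 20 bits across a surrogate pair.
        int v = codePoint - 0x10000;
        char high = (char) (0xD800 | (v >> 10));   // 0xD800..0xDBFF
        char low  = (char) (0xDC00 | (v & 0x3FF)); // 0xDC00..0xDFFF
        return new char[] { high, low };
    }

    public static void main(String[] args) {
        char[] pair = encode(0x1D11E);
        System.out.println(Integer.toHexString(pair[0]) + " " + Integer.toHexString(pair[1])); // d834 dd1e
        System.out.println(Arrays.equals(pair, Character.toChars(0x1D11E)));                   // true
    }
}
```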
