RE: Encoding conversions

Carl W. Brown Mon, 10 Sep 2001 07:17:03 -0700
Roozbeh,

My approach is to convert UCES to UTF-16 simply by not checking for
surrogates while decoding.  The reverse conversion never happens, and UCES
input will not convert properly to UTF-32.
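
To illustrate the point, here is a minimal Python sketch (not the xIUA
code; the function name decode_unchecked is made up for this example).
It decodes 1- to 3-byte sequences into 16-bit code units without looking
for surrogates, so a UCES-style surrogate pair falls straight out as valid
UTF-16, while the same units read as UTF-32 values are two lone surrogates.

```python
def decode_unchecked(data):
    """Decode 1- to 3-byte UTF-8 sequences into 16-bit code units.

    Hypothetical helper: no surrogate check is made, and malformed or
    truncated input is not handled.
    """
    units = []
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                      # single byte (ASCII)
            units.append(b)
            i += 1
        elif 0xC0 <= b <= 0xDF:           # two-byte sequence
            units.append(((b & 0x1F) << 6) | (data[i + 1] & 0x3F))
            i += 2
        elif 0xE0 <= b <= 0xEF:           # three-byte sequence
            units.append(((b & 0x0F) << 12)
                         | ((data[i + 1] & 0x3F) << 6)
                         | (data[i + 2] & 0x3F))
            i += 3
        else:
            raise ValueError("4-byte sequences are not handled in this sketch")
    return units


# U+10000 stored as two 3-byte sequences, one per UTF-16 surrogate.
uces = b"\xed\xa0\x80\xed\xb0\x80"
units = decode_unchecked(uces)

# Interpreted as UTF-16 the pair is correct ...
utf16_bytes = b"".join(u.to_bytes(2, "big") for u in units)
text = utf16_bytes.decode("utf-16-be")

# ... but taken one-to-one as UTF-32 values they are two surrogates,
# not the single code point 0x10000.
```

Here units is [0xD800, 0xDC00]: fine as UTF-16, wrong as UTF-32 unless the
pair is recombined in an extra step.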

If callers validate UTF-8 (xiua_ValidateStr), it checks that each character
starts with a valid UTF-8 lead byte followed by the proper number of
continuation bytes, if any.  It also makes sure that the character is not a
surrogate, is not a reversed BOM (U+FFFE), and does not exceed the
Unicode 3.1 character range (U+10FFFF).
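
The checks listed above can be sketched roughly as follows.  This is an
illustration in Python, not the actual xiua_ValidateStr source; the name
validate_utf8 is hypothetical, and overlong forms are not rejected because
only the checks named in this message are implemented.

```python
def validate_utf8(data):
    """Return True if data passes the checks described in the message.

    Hypothetical sketch: lead byte, continuation count, no surrogates,
    no reversed BOM, nothing above U+10FFFF.
    """
    i = 0
    n = len(data)
    while i < n:
        lead = data[i]
        if lead < 0x80:
            count, cp = 0, lead
        elif 0xC0 <= lead <= 0xDF:
            count, cp = 1, lead & 0x1F
        elif 0xE0 <= lead <= 0xEF:
            count, cp = 2, lead & 0x0F
        elif 0xF0 <= lead <= 0xF7:
            count, cp = 3, lead & 0x07
        else:
            return False          # stray continuation byte or 5/6-byte lead
        if i + count >= n:
            return False          # sequence is truncated
        for k in range(1, count + 1):
            c = data[i + k]
            if (c & 0xC0) != 0x80:
                return False      # not a continuation byte
            cp = (cp << 6) | (c & 0x3F)
        if 0xD800 <= cp <= 0xDFFF:
            return False          # surrogate (what UCES would produce)
        if cp == 0xFFFE:
            return False          # reversed BOM
        if cp > 0x10FFFF:
            return False          # beyond the Unicode 3.1 range
        i += count + 1
    return True
```

With this sketch, validate_utf8(b"\xed\xa0\x80\xed\xb0\x80") is False
because the bytes decode to surrogates, while the proper four-byte form
b"\xf0\x90\x80\x80" of the same character passes.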

Carl

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of Roozbeh Pournader
> Sent: Monday, September 10, 2001 3:24 AM
> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: RE: Encoding conversions
>
>
>
> On Sun, 9 Sep 2001, Carl W. Brown wrote:
>
> > > No, Oracle did not design UTF-8 at all. The RFC 2279 specifies UTF-8,
> > > and it encodes all characters from U+00000000 to U+7FFFFFFF.
> >
> > What I meant was the Oracle implementation of UTF-8.
>
> They are now calling it UCES-8; the only remnant will be an Oracle
> datatype named "UTF8", which is really UCES-8.
>
> UTC is also working on restricting UTF-8 to something equivalent to
> RFC 2279's definition (well, for the range U+0000 to U+10FFFF) in Unicode
> 3.2. That's very good news I think.
>
> roozbeh
>
> -
> Linux-UTF8:   i18n of Linux on all levels
> Archive:      http://mail.nl.linux.org/linux-utf8/
