Roozbeh,
My approach is to convert UCES-8 to UTF-16 simply by not checking for
surrogates. The reverse conversion will not produce UCES-8, though, and
UCES-8 input will not convert properly to UTF-32.
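
To illustrate why skipping the check is enough, here is a rough sketch in C.
It is not the actual converter; the function name is made up, and it assumes
that UCES-8 encodes a supplementary character as two three-byte sequences,
one per surrogate. A decoder that does not reject surrogates then emits the
surrogate pair directly, which is already correct UTF-16:

```c
#include <stddef.h>

/* Hypothetical helper, for illustration only: decode UTF-8-style byte
 * sequences into 16-bit units WITHOUT rejecting surrogates.
 * Returns the number of units written, or (size_t)-1 on a bad lead
 * byte, truncated input, or a full output buffer. */
size_t sketch_to_utf16_unchecked(const unsigned char *s, size_t len,
                                 unsigned short *out, size_t cap)
{
    size_t i = 0, n = 0;

    while (i < len) {
        unsigned char b = s[i];
        unsigned long cp;
        size_t extra, k;

        if (b < 0x80)                { cp = b;        extra = 0; }
        else if ((b & 0xE0) == 0xC0) { cp = b & 0x1F; extra = 1; }
        else if ((b & 0xF0) == 0xE0) { cp = b & 0x0F; extra = 2; }
        else if ((b & 0xF8) == 0xF0) { cp = b & 0x07; extra = 3; }
        else return (size_t)-1;          /* not a usable lead byte */

        if (extra > len - i - 1)
            return (size_t)-1;           /* sequence runs past the end */

        /* Continuation bytes are merged in without any validation. */
        for (k = 1; k <= extra; k++)
            cp = (cp << 6) | (s[i + k] & 0x3F);
        i += extra + 1;

        if (cp >= 0x10000UL) {
            /* Real four-byte UTF-8: split into a surrogate pair. */
            if (cap - n < 2)
                return (size_t)-1;
            cp -= 0x10000UL;
            out[n++] = (unsigned short)(0xD800 | (cp >> 10));
            out[n++] = (unsigned short)(0xDC00 | (cp & 0x3FF));
        } else {
            /* Includes D800..DFFF coming from UCES-8: passed through. */
            if (n >= cap)
                return (size_t)-1;
            out[n++] = (unsigned short)cp;
        }
    }
    return n;
}
```

Under that assumption, the UCES-8 bytes ED A0 80 ED B0 80 and the UTF-8
bytes F0 90 80 80 both come out as D800 DC00. A UTF-32 converter built the
same way would emit the two surrogates as separate code points, which is why
that direction does not work.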
If they validate UTF-8 (xiua_ValidateStr), it checks that each character
starts with a valid UTF-8 initial byte followed by the proper number of
continuation bytes, if any. It also makes sure that the character is not a
surrogate, is not a reversed BOM (U+FFFE), and does not exceed the
Unicode 3.1 character range (U+10FFFF).
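
For anyone who wants to see those checks spelled out, here is a minimal
sketch in C. It is not the xiua_ValidateStr source; the function name and
signature are invented for illustration. The rejection of overlong forms is
an extra check I have added here, not one listed above:

```c
#include <stddef.h>

/* Hypothetical validator, for illustration only (not xiua_ValidateStr).
 * Returns 1 if s[0..len) is acceptable UTF-8 by the rules above, else 0. */
int sketch_validate_utf8(const unsigned char *s, size_t len)
{
    size_t i = 0;

    while (i < len) {
        unsigned char b = s[i];
        unsigned long cp;    /* decoded code point */
        unsigned long min;   /* smallest value this length may encode */
        size_t extra, k;

        /* 1. Valid initial byte, which fixes the continuation count. */
        if (b < 0x80)                    { cp = b;        extra = 0; min = 0; }
        else if (b >= 0xC0 && b <= 0xDF) { cp = b & 0x1F; extra = 1; min = 0x80; }
        else if (b >= 0xE0 && b <= 0xEF) { cp = b & 0x0F; extra = 2; min = 0x800; }
        else if (b >= 0xF0 && b <= 0xF7) { cp = b & 0x07; extra = 3; min = 0x10000UL; }
        else return 0;   /* stray continuation byte or 5/6-byte lead */

        /* 2. Proper number of continuation bytes (10xxxxxx). */
        if (extra > len - i - 1)
            return 0;
        for (k = 1; k <= extra; k++) {
            unsigned char c = s[i + k];
            if ((c & 0xC0) != 0x80)
                return 0;
            cp = (cp << 6) | (c & 0x3F);
        }

        /* Added assumption: reject overlong encodings. */
        if (cp < min)
            return 0;
        /* 3. Not a surrogate. */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;
        /* 4. Not a reversed BOM. */
        if (cp == 0xFFFE)
            return 0;
        /* 5. Within the Unicode 3.1 range. */
        if (cp > 0x10FFFFUL)
            return 0;

        i += extra + 1;
    }
    return 1;
}
```

With these rules a UCES-8 string containing a supplementary character fails
validation, because each half decodes to a surrogate.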
Carl
> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED]]On Behalf Of Roozbeh Pournader
> Sent: Monday, September 10, 2001 3:24 AM
> To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
> Subject: RE: Encoding conversions
>
>
>
> On Sun, 9 Sep 2001, Carl W. Brown wrote:
>
> > > No, Oracle did not design UTF-8 at all. The RFC 2279 specifies UTF-8,
> > > and it encodes all characters from U+00000000 to U+7FFFFFFF.
> >
> > What I meant was the Oracle implementation of UTF-8.
>
> They are now calling it UCES-8, the only remain will now be an Oracle
> datatype named "UTF8" which is UCES-8 really.
>
> UTC is also working on restricting UTF-8 to something equivalent to
> RFC 2279's definition (well, for the range U+0000 to U+10FFFF) in Unicode
> 3.2. That's very good news I think.
>
> roozbeh
>
> -
> Linux-UTF8: i18n of Linux on all levels
> Archive: http://mail.nl.linux.org/linux-utf8/