mir Mehrotra,
i-flex Solutions Limited,
a CitiCorp venture capital company
at SEI-CMM level 5.
[EMAIL PROTECTED]
-Original Message-
From: John H. Jenkins [SMTP:[EMAIL PROTECTED]]
Sent: Tuesday, July 25, 2000 8:12 AM
To: Unicode List
Subject:Re: Oracle and Surr
Cherlin [mailto:[EMAIL PROTECTED]]
Sent: Saturday, July 29, 2000 2:33 PM
To: Unicode List
Subject: Re: FW: Oracle and Surrogate Pairs
At 2:41 AM -0800 7/25/2000, [EMAIL PROTECTED] wrote:
Hi all,
I have been developing/converting software to support multiple
languages, especially Japanese
TED]]
Sent: Tuesday, July 25, 2000 8:12 AM
To: Unicode List
Subject: Re: Oracle and Surrogate Pairs
Does the field in question need to support literally any possible
character
in Unicode 3.0 and beyond (since 3.0 does not have any surrogates
assigned!)?
True, but
As the Oracle UTF8 character set definition supports surrogates as a pair of two
3-byte sequences, to stay in sync with UTF-16 in binary sorting and code point,
This is not a conformant representation.
D29 (p. 46) states that a UTF "transforms each Unicode scalar value into a
unique sequence of code values". Am
You could define a UTF that mapped scalar values below to the same as
UTF-8, and values above to a 6 byte value. It would *not* be UTF-8, but it
can be well defined.
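
To make the difference concrete, here is a minimal sketch of the two byte forms being discussed: conformant UTF-8 (one sequence of up to 4 bytes per scalar value) and the form that encodes each UTF-16 surrogate as its own 3-byte sequence (6 bytes per supplementary character). The function names are made up for illustration, and this is not meant to describe Oracle's actual implementation, only the encoding arithmetic.

```python
def three_byte(unit: int) -> bytes:
    """Encode one 16-bit code unit with the 3-byte UTF-8 bit pattern."""
    return bytes([
        0xE0 | (unit >> 12),
        0x80 | ((unit >> 6) & 0x3F),
        0x80 | (unit & 0x3F),
    ])


def encode_utf8(cp: int) -> bytes:
    """Conformant UTF-8 for a Unicode scalar value (not a lone surrogate)."""
    return chr(cp).encode("utf-8")


def encode_surrogate_pair_form(cp: int) -> bytes:
    """Hypothetical helper: the 2 x 3-byte form for supplementary characters.

    Values up to U+FFFF are encoded exactly as in UTF-8; values above are
    split into a UTF-16 surrogate pair and each surrogate gets 3 bytes.
    """
    if cp < 0x10000:
        return chr(cp).encode("utf-8")
    v = cp - 0x10000
    high = 0xD800 + (v >> 10)      # high (leading) surrogate
    low = 0xDC00 + (v & 0x3FF)     # low (trailing) surrogate
    return three_byte(high) + three_byte(low)


if __name__ == "__main__":
    for cp in (0x41, 0x65E5, 0xFF5E, 0x10000):
        print(f"U+{cp:04X}",
              encode_utf8(cp).hex(" "),
              "|",
              encode_surrogate_pair_form(cp).hex(" "))
```

For U+10000 the first form gives F0 90 80 80 and the second gives ED A0 80 ED B0 80. The second form also sorts U+10000 before U+FF5E in a byte-wise comparison, as UTF-16 code units do, whereas UTF-8 sorts it after.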
If you look below D29 -- p. 46 at the first full paragraph -- you find that for
round tripping, UTFs are required to map
Title: Oracle and Surrogate Pairs
What is the correct way of supporting surrogate pairs in Oracle 8? Is there anything wrong with the approach of making fields 3 times longer than ASCII, or should fields be 4 times ASCII as per the UTF-8 spec?
Later,
Mikko
Globalization Specialist
Onyx Software
[EMAIL
MAIL PROTECTED]
Sent: Monday, July 24, 2000 4:28 PM
Subject: Oracle and Surrogate Pairs
What is the correct way of supporting surrogate pairs in Oracle 8? Is there anything
wrong with the approach of making fields 3 times longer than ASCII, or should
fields be 4 times ASCII as per the UTF-8 spec?
Later,
of supporting surrogate pairs
in Oracle 8? Is there anything wrong with the approach of making fields 3 times longer
than ASCII, or should fields be 4 times ASCII as per the UTF-8 spec?
Later,
Mikko
Globalization Specialist
Onyx Software
[EMAIL PROTECTED]
www.onyx.com
425.519.4172
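
One way to check the 3x versus 4x question is to count bytes for a sample string under both forms. This is only a rough sketch: the helper names are invented here, and `paired_len` assumes the form in which each supplementary character is stored as two 3-byte sequences. It does not query Oracle or reflect how any particular column length is declared.

```python
def utf16_units(s: str) -> int:
    """Number of UTF-16 code units (a supplementary character counts as 2)."""
    return len(s.encode("utf-16-le")) // 2


def utf8_len(s: str) -> int:
    """Byte length in conformant UTF-8 (up to 4 bytes per character)."""
    return len(s.encode("utf-8"))


def paired_len(s: str) -> int:
    """Byte length if each supplementary character takes 2 x 3 bytes."""
    return sum(6 if ord(ch) > 0xFFFF else len(ch.encode("utf-8")) for ch in s)


if __name__ == "__main__":
    # Two supplementary characters: the worst case for a 3x-per-character rule.
    sample = "\U00010400" * 2
    print("characters:", len(sample))
    print("UTF-16 code units:", utf16_units(sample))
    print("UTF-8 bytes:", utf8_len(sample))
    print("paired-form bytes:", paired_len(sample))
```

Under these assumptions, 3 bytes per UTF-16 code unit is enough for the paired form and 4 bytes per character is enough for conformant UTF-8, but 3 bytes per character is not enough for either once supplementary characters appear.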
begin:vcard
n:Yang;Jianping
tel
, July 24, 2000 5:08 PM
To: Mikko Lahti
Cc: Unicode List
Subject: Re: Oracle and Surrogate Pairs
Mikko,
As there is no character defined in the surrogate range in Unicode 3.0, the
maximum width for the Oracle UTF8 character set is 3 bytes. Here I recommend
that you use 3 times the number of characters
Does the field in question need to support literally any possible character
in Unicode 3.0 and beyond (since 3.0 does not have any surrogates
assigned!)?
True, but within a year or so, there *will* be surrogates assigned in Unicode. One
cannot be premature in supporting them at