In a message dated 2001-06-19 10:36:40 Pacific Daylight Time,
[EMAIL PROTECTED] writes:
I agree with you; the problem is that the D800 to DFFF codes were never
defined as valid Unicode characters.
True; there were never characters assigned into these positions.
Encoding these into ED xx
In a message dated 2001-06-19 6:46:14 Pacific Daylight Time,
[EMAIL PROTECTED] writes:
If you take the original UCS-2 to UTF-8 mechanism
(back when UTF-8 was called UTF-FSS) and apply it to surrogates, the
sequence D800 DC00 would map to the sequence ED A0 80 ED B0 80.
Very true:
U+D800
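The mapping quoted above can be sketched in a few lines. This is an illustrative helper (the name `ucs2_to_utf8` is mine, not anything from the original UTF-FSS specification or any shipping implementation): it applies the original 16-bit UCS-2-to-UTF-8 rules to each code unit independently, with no special handling of surrogates, and so produces the six-byte sequence described.

```python
def ucs2_to_utf8(code_units):
    """Apply the original UCS-2 -> UTF-8 (UTF-FSS) rules to each 16-bit
    code unit independently, with no special handling for surrogates."""
    out = bytearray()
    for u in code_units:
        if u < 0x80:
            out.append(u)                       # 1 byte: 0xxxxxxx
        elif u < 0x800:
            out += bytes([0xC0 | (u >> 6),      # 2 bytes: 110xxxxx
                          0x80 | (u & 0x3F)])   #          10xxxxxx
        else:
            out += bytes([0xE0 | (u >> 12),           # 3 bytes: 1110xxxx
                          0x80 | ((u >> 6) & 0x3F),   #          10xxxxxx
                          0x80 | (u & 0x3F)])         #          10xxxxxx
    return bytes(out)

# The surrogate pair D800 DC00 (U+10000 in UTF-16) comes out as six bytes:
print(ucs2_to_utf8([0xD800, 0xDC00]).hex(' '))   # ed a0 80 ed b0 80

# Whereas UTF-8 as defined for UTF-16-aware systems encodes U+10000 directly:
print('\U00010000'.encode('utf-8').hex(' '))     # f0 90 80 80
```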
From: [EMAIL PROTECTED]
Waiting until characters were assigned outside the BMP to start working
on the UCS-2 problem is like waiting until 2000-01-01 to start working
on the Y2K problem.
It's actually a bit worse than this -- it's coming up with a solution to Y2K
problems that requires other
Mark,
This is too strong a statement. Yes, UTF-FSS was designed to represent
code points above in 4 bytes. But let's look at the path that the software
would take over history. If you take the original UCS-2 to UTF-8 mechanism
(back when UTF-8 was called UTF-FSS) and apply it to
Mark Davis wrote:
You are correct about the published definitions. As I recall, though, we
were referring to UTF-FSS as UTF-8 in the UTC meetings before it was changed
to account for UTF-16.
In any event, I don't know whether Oracle was involved in those discussions
or not, or whether
Jianping,
It's a reasonable set of requirements you laid out. However,
with respect to this last paragraph, as Unicode 3.1 did not
exist when 8i was current, is it not unreasonable to insist
that users wanting to work with 3.1, or in particular supplementary
characters, must first upgrade?
There is one statement that appears to want to be framed:
Jianping Yang wrote:
[...] When Unicode
came to version 2.1, we found our AL24UTFFSS had trouble with 2.1 because of
the Hangul reallocation, and we could not simply update AL24UTFFSS to the 2.1
definition, as it would corrupt existing users' data in
Markus Scherer wrote:
This means that Oracle mis-implemented the UTF-8 standard as it was specified at
that time, starting at least with Unicode 2.0.
No, Oracle did not mis-implement the UTF-8 standard; it only limited its
support to the BMP. Aside from the backward compatibility reason,
UTF8 before or after the
definition was changed.
Mark
----- Original Message -----
From: Kenneth Whistler [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Tuesday, June 12, 2001 12:44
Subject: FSS-UTF, UTF-2, UTF-8, and UTF-16
Mark said:
UTF-8 was defined before UTF-16. At the time it was first defined, there
were no surrogates, so there was no special handling of the D800..DFFF code
points.
Technically, the first statement is not true.
UTF-2 and FSS-UTF *were* defined well before UTF-16. FSS-UTF was
defined on
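As a concrete illustration of the point at issue, modern UTF-8 decoders treat the D800..DFFF code points specially: a strict decoder rejects the six-byte surrogate-pair encoding outright. The sketch below uses Python's standard codecs; the `surrogatepass` error handler is a Python-specific escape hatch for the old-style sequences, not part of the UTF-8 definition.

```python
# Bytes produced by applying the original per-code-unit rules to D800 DC00:
data = bytes.fromhex('eda080edb080')

# A conformant UTF-8 decoder rejects encoded surrogate code points:
try:
    data.decode('utf-8')
    print('accepted')
except UnicodeDecodeError:
    print('rejected: surrogate code points are ill-formed in UTF-8')

# Python's 'surrogatepass' handler tolerates the old-style sequences,
# yielding the two unpaired surrogate code points:
s = data.decode('utf-8', 'surrogatepass')
print([hex(ord(c)) for c in s])   # ['0xd800', '0xdc00']
```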