Re: Unicode, character ambiguities

Tomohiro KUBOTA Fri, 11 Jan 2002 05:39:36 -0800

Hi,

At Fri, 11 Jan 2002 04:51:35 -0800,
Edward Cherlin wrote:


> > For example, I can write
> > "the cost is \100 and the file is C:\text\abc.txt" or,
> 
> How is such code executed, then? It appears severely broken. No 
> compiler can tell from this code fragment which is supposed to be 
> which, since \100 is a legitimate filespec in Windows.

This is not code.  Assume it is a message meant for humans to read.


> Fixing the source code at the source is a lot cleaner than inflicting 
> your "fix" on the rest of the world. It's as bad as Oracle's attempt 
> to define a standard for its variant UTF-8 (CESU-8, which apparently 
> should be pronounced 'sezyu' in English). Their stated reason is the 
> same, that it's too much work to fix all of their databases, and 
> their cure is to lay even more work off on the rest of the world.

First of all, this problem affects not only source code but also
many texts written by end users.  You can easily imagine end users'
text files containing many "\" as a currency sign AND many "\"
as an element of file names.  Even if you succeed in persuading
every Japanese Windows programmer to modify their source code,
you won't succeed in persuading Japanese business users to
modify their files, such as accounts.xls.
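
To illustrate the point, here is a minimal Python sketch.  The helper
decode_0x5c is hypothetical (not part of any library), and it assumes
the text is plain ASCII apart from the question of how to read byte
0x5C.  It shows that a single file-wide choice gets one of the two
uses wrong:

```python
# Hypothetical helper: the input is assumed to be ASCII-only, and the only
# open question is whether byte 0x5C means BACKSLASH or YEN SIGN.
YEN = "\u00a5"        # U+00A5 YEN SIGN
BACKSLASH = "\u005c"  # U+005C REVERSE SOLIDUS

def decode_0x5c(data: bytes, as_yen: bool) -> str:
    """Decode ASCII bytes, reading every 0x5C as yen or as backslash."""
    text = data.decode("ascii")
    return text.replace(BACKSLASH, YEN) if as_yen else text

# The example message from this thread, as raw bytes.
message = b"the cost is \\100 and the file is C:\\text\\abc.txt"

as_currency = decode_0x5c(message, as_yen=True)   # price right, path wrong
as_path = decode_0x5c(message, as_yen=False)      # path right, price wrong

print(as_currency)
print(as_path)
```

Neither result is correct for the whole message: the first turns the
path separators into yen signs, and the second turns the price into
a backslash.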

In Oracle's case, the problem was limited to the _internal_
encoding of the database (which end users don't care about), and
end users need not notice any trouble if Oracle does its job
well.  Moreover, conversion from CESU-8 to correct UTF-8
can be done with a simple algorithm.  On the other hand, the
meaning of "\" depends on context and, ultimately, only the
writer of the "\" knows whether it should be U+005C or U+00A5.
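
As a rough sketch of why the CESU-8 case is mechanical, the following
Python function converts CESU-8 bytes to standard UTF-8 with no
knowledge of context.  The name cesu8_to_utf8 is my own, and the sketch
assumes well-formed input in which surrogates always come in pairs.
This is one possible approach, not Oracle's implementation:

```python
def cesu8_to_utf8(data: bytes) -> bytes:
    """Convert CESU-8 bytes to standard UTF-8 (assumes paired surrogates)."""
    # CESU-8 encodes each UTF-16 surrogate as its own 3-byte sequence.
    # "surrogatepass" lets the UTF-8 decoder accept those sequences.
    with_surrogates = data.decode("utf-8", "surrogatepass")
    # A round trip through UTF-16 joins each surrogate pair into the
    # single supplementary code point it represents.
    joined = with_surrogates.encode("utf-16-le", "surrogatepass").decode("utf-16-le")
    return joined.encode("utf-8")

# U+10400 is D801 DC00 in UTF-16, hence ED A0 81 ED B0 80 in CESU-8.
cesu8_sample = b"\xed\xa0\x81\xed\xb0\x80"
print(cesu8_to_utf8(cesu8_sample).hex())
```

No comparable function can exist for "\", because the bytes alone do
not say which character the writer meant.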

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
