Hi,

At Fri, 11 Jan 2002 04:51:35 -0800,
Edward Cherlin wrote:

> > For example, I can write
> > "the cost is \100 and the file is C:\text\abc.txt" or,
>
> How is such code executed, then? It appears severely broken. No
> compiler can tell from this code fragment which is supposed to be
> which, since \100 is a legitimate filespec in Windows.

This is not code.  Assume it is a message meant for humans to read.

> Fixing the source code at the source is a lot cleaner than inflicting
> your "fix" on the rest of the world. It's as bad as Oracle's attempt
> to define a standard for its variant UTF-8 (CESU-8, which apparently
> should be pronounced 'sezyu' in English). Their stated reason is the
> same, that it's too much work to fix all of their databases, and
> their cure is to lay even more work off on the rest of the world.

First of all, this problem affects not only source code but also
many texts written by end users.  You can easily imagine end users'
text files that contain many "\" used as a currency sign AND many "\"
used as an element of file names.  Even if you succeed in persuading
every Japanese Windows programmer to modify their source code, you
won't succeed in persuading Japanese business users to modify their
files, such as accounts.xls .

In Oracle's case, the problem was limited to the _internal_ encoding
of the database (which end users don't care about), so end users need
not notice any trouble, provided Oracle does a good job.  Moreover,
conversion from CESU-8 to correct UTF-8 can be done with a simple
algorithm.  The meaning of "\", on the other hand, depends on context
and, ultimately, only the writer of the "\" knows whether it should
be U+005C or U+00A5.

---
Tomohiro KUBOTA <[EMAIL PROTECTED]>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/
--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
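
P.S. To illustrate the claim that CESU-8 can be converted to correct
UTF-8 mechanically, here is a minimal sketch in Python.  The function
name cesu8_to_utf8 is hypothetical, and the sketch assumes the input
is well-formed CESU-8 (every surrogate is properly paired).  It is not
taken from Oracle or from any standard library.

```python
def cesu8_to_utf8(data: bytes) -> bytes:
    """Convert CESU-8 bytes to standard UTF-8 (hypothetical helper)."""
    # CESU-8 stores each supplementary character as a UTF-16 surrogate
    # pair, with each surrogate encoded as its own 3-byte sequence.
    # "surrogatepass" lets the UTF-8 decoder accept those sequences.
    text = data.decode("utf-8", errors="surrogatepass")
    # A round trip through UTF-16 joins each surrogate pair into a
    # single code point; an unpaired surrogate raises UnicodeDecodeError.
    text = text.encode("utf-16-le", errors="surrogatepass").decode("utf-16-le")
    return text.encode("utf-8")


# U+10000 in CESU-8 is the surrogate pair D800 DC00, three bytes each.
print(cesu8_to_utf8(b"\xed\xa0\x80\xed\xb0\x80"))  # b'\xf0\x90\x80\x80'
```

No comparable function can be written for "\": the byte is the same
in both uses, so the input carries nothing that distinguishes U+005C
from U+00A5.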
