On Thursday 10 January 2002 02:43 pm, you wrote:
> Hi,
>
> At Thu, 10 Jan 2002 13:40:09 -0800,
>
> Edward Cherlin wrote:
> > > (Unfortunately, Microsoft Japanese fonts don't *have* a
> > > single-width backslash *at all*, which means terminal
> > > emulators--which typically don't want to deal with multiple
> > > fonts--are hard pressed to do anything like this at all.
> > > Grrr.)
> >
> > These broken fonts must be replaced. That is the only way to get
> > proper display of Japanese using Japanese fonts in Unicode under
> > Windows. Unfortunately, they exist because some Japanese claim
> > that they are *necessary* for proper display of Japanese in
> > Unicode, which is nonsense.
>
> Yes, I think this is a mess. However, Microsoft cannot change it.
> For example, I can write
> "the cost is \100 and the file is C:\text\abc.txt" or,
How is such code executed, then? It appears severely broken. No
compiler can tell from this fragment which backslash is meant as a
yen sign and which as a path separator, since \100 is also a
legitimate filespec in Windows.
> printf("The cost is \\100.\n");
> Here, "\" in "\100" or "\\100" means yen sign (and you may think
> it should be mapped into U+00A5),
Exactly. I can't think of any other possibility. Do that and the
whole problem goes away, so we can fix the fonts. You don't need the
fonts fixed first, so it's up to the Japanese programming community
to get their heads on straight and face up to their responsibilities.
Otherwise we will have to declare the Japanese reputation for quality
a myth.
> while the codepoint of U+005C
> is proper for "\" in a file name or "\n". We cannot transcode such
> strings automatically.
I doubt that, assuming that the compiler can tell what you want done.
I think my son in college could code a Perl script for any particular
programming language you need translated. If you need more help, I
can get hold of some of the people who handled the date
identification mess in all of the varieties of source code during the
Y2K cleanup.
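
To make that concrete, here is a minimal sketch, in Python rather than
Perl, of the kind of script I have in mind. It is an illustration under
stated assumptions, not a finished tool: the function name retag_yen is
made up, it only looks inside double-quoted C string literals, it
ignores comments and character literals, and it assumes that an escaped
backslash immediately followed by a digit is a price. A real conversion
would need a better rule for each language and code base.

```python
YEN = "\u00a5"  # U+00A5 YEN SIGN


def retag_yen(src: str) -> str:
    """Rewrite escaped backslashes that look like currency as U+00A5.

    Hypothetical heuristic: inside a double-quoted string literal, an
    escaped backslash followed directly by an ASCII digit is assumed
    to be a yen sign. Every other escape is copied through unchanged.
    """
    out = []
    in_string = False
    i = 0
    n = len(src)
    while i < n:
        ch = src[i]
        if not in_string:
            # Outside a literal: copy text and watch for an opening quote.
            if ch == '"':
                in_string = True
            out.append(ch)
            i += 1
        elif ch == "\\" and i + 1 < n:
            # Inside a literal, a backslash always starts a two-character escape.
            nxt = src[i + 1]
            digit_follows = i + 2 < n and src[i + 2] in "0123456789"
            if nxt == "\\" and digit_follows:
                out.append(YEN)       # assumed currency: emit U+00A5
            else:
                out.append(ch + nxt)  # \n, \", path separators: leave alone
            i += 2
        else:
            # Ordinary character inside the literal; an unescaped quote ends it.
            if ch == '"':
                in_string = False
            out.append(ch)
            i += 1
    return "".join(out)
```

Run over your printf example, this turns the escaped backslash before
100 into U+00A5 and leaves the \n alone, while a literal such as
"C:\\text\\abc.txt" passes through untouched because no digit follows
its backslashes.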
> Thus, Microsoft chose to map
> "\" to U+005C and to make the glyph for U+005C a yen sign.
> I heard that Java also has such a mapping table.
If so, they are also part of the problem, rather than part of the
solution.
> I agree this is a severe violation of Unicode standard but I don't
> know any clean solution.
Fixing the source code at the source is a lot cleaner than inflicting
your "fix" on the rest of the world. It's as bad as Oracle's attempt
to define a standard for its variant UTF-8 (CESU-8, which apparently
should be pronounced 'sezyu' in English). Their stated reason is the
same: it's too much work to fix all of their databases. Their cure is
to push even more work onto the rest of the world.
> ---
> Tomohiro KUBOTA <[EMAIL PROTECTED]>
> http://www.debian.or.jp/~kubota/
> "Introduction to I18N"
> http://www.debian.org/doc/manuals/intro-i18n/
--
Edward Cherlin
[EMAIL PROTECTED]
Does your Web site work?
--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/