Re: Unicode is optimal for Chinese/Japanese multilingual texts

Markus Kuhn Sat, 07 Apr 2001 09:02:49 -0700
Tomohiro KUBOTA wrote on 2001-04-07 15:10 UTC:
> Markus Kuhn <[EMAIL PROTECTED]> wrote:
> 
> > Han unification is far more than just a technological restriction
> > manifested in ISO 10646.
> 
> Don't you know that this discussion is about glyph, not character?

I am confident that I understand the issues perfectly well after 1.5
years of discussion here and your replies show me that you frequently do
not understand all of what I wrote and draw false conclusions from the
fragments that you believe you did understand. I acknowledge that there
is a language barrier affecting our communication, which is why I try to
very patiently repeat and rephrase my statements again and again. I hope
that you will reward my efforts by seriously trying to read my writings
carefully.

> I thought you have admitted that Japanese text must be written
> using Japanese glyph.  I am tired to insist same thing again
> and again in foreign language...
> 
> If you really want to insist unification of Japanese and Chinese
> glyph, please insist unification of Latin and Cyrillic glyph also.

I never insisted on unification of Japanese and Chinese glyph styles. I
only said that Han glyph unification is an idea that has been around for
a long time and that seems historically and culturally like a reasonable
idea to me. I have personally absolutely no interest either for or
against Han glyph unification. I do not use the Han script, therefore
the question whether Han glyph unification is good or bad has never been
any of my personal or political business. I also said that Han coded
character set unification is a technological requirement before Han
glyph unification can become possible at all. UCS is the world's first
widely deployed coded character set that is suitable for both continuing
existing practice *and* for Han glyph unification (if you want it). None
of the older JIS, GB, KS, etc. standards is suitable for Han glyph
unification.

UCS just enables Han glyph unification by providing Han character
unification. It does not prescribe Han glyph unification in any way, and
neither do I. Note that ISO 10646 was not printed in a unified Han glyph
style. ISO 10646 was even printed using five (!) different Han font
style columns. UCS font designers can now either stick with one of these
five example glyph styles or they can use their artistic taste to blend
these styles together in arbitrary typographically pleasing ways.

While I do not care about Han glyph unification, I do care a great deal
about the usefulness and simplicity of the plaintext infrastructure that
we provide with POSIX implementations. I am convinced that UCS is an
excellent substitute for all other existing coded character sets. I am
also convinced that code-set independent programming either
significantly adds to software complexity or significantly reduces
commonly implemented functionality (example: the broken search functions
in MULE-Emacs).

I believe code-set independent programming is a clear waste of time,
money and intellectual resources. I also observe undisputable signs that
the support for legacy character sets has significantly improved with
the ongoing hardwiring to UCS. Example: I read/write email using the
exmh software which is implemented in TCL/Tk. Two years ago, Shift-JIS/
EUC-JP/ISO-2022-JP/etc. support required a huge and enormously complex
patch to exmh that was available but that no sane person in Europe
would ever install. Today, TCL/Tk uses UCS as its only character set
internally and provides the usual conversion library functions based on
the Unicode mapping tables. With *very* little change, an updated exmh
is now able to provide the full range of CJK and European legacy
character set support. This was possible thanks to the hardwiring of TCL
strings to Unicode! There are numerous other success stories like this.
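
A minimal sketch of the "one internal character set plus conversion at
the edges" approach described above, written in Python rather than Tcl.
Python's built-in codecs stand in here for the Tcl conversion library
functions; the helper names (to_internal, to_wire, contains) are
hypothetical and not taken from exmh or Tcl/Tk.

```python
# Assumption: Python's standard codecs play the role of the conversion
# library; all helper names below are made up for illustration.
LEGACY_ENCODINGS = ("shift_jis", "euc_jp", "iso2022_jp")


def to_internal(raw: bytes, charset: str) -> str:
    """Decode bytes in a legacy charset into the single internal form."""
    return raw.decode(charset)


def to_wire(text: str, charset: str) -> bytes:
    """Encode the internal form back into a legacy charset for output."""
    return text.encode(charset)


def contains(haystack: str, needle: str) -> bool:
    """Search once on the internal form, whatever the source charset was."""
    return needle in haystack


subject = "日本語のメール"

# The same text as it might arrive in three different legacy encodings.
messages = {cs: to_wire(subject, cs) for cs in LEGACY_ENCODINGS}

# After decoding at the boundary, the application sees identical strings.
decoded = {cs: to_internal(raw, cs) for cs, raw in messages.items()}
```

The application logic (here only a substring search) is written once
against the internal representation; supporting another legacy charset
means adding a mapping table, not touching that logic.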

I am also convinced that if Japanese users have an urgent need to switch
glyph styles within multi-lingual documents, they should use a proper
font selection mechanism. The Unicode language tags are a bad compromise
intended mostly to help stubborn Japanese ISO 2022 fans to see how their
mental model of plaintext processing could be preserved in the UCS
world. They are not a proper solution, but they are a significant
improvement over ISO 2022. They do not really belong in a character set
standard but should be left to higher-layer protocols, and I am glad to
see that the HTML folks banned them quickly from their format.

I hope this helps you to understand much better what I was talking about
all the time ...

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/