Re: CJK Unification

Markus Kuhn
Fri, 30 Mar 2001 02:52:05 -0800
PILCH Hartmut wrote on 2001-03-30 08:54 UTC:
> Putting the 'A' of latin, cyrillic and greek at the same codepoint
> would not have necessarily been wrong.  It just wasn't done, because
> there are so few alphabetic characters, and lining them up in different
> codespaces makes it easier to implement sorting and much more.

It was not done for exactly one reason: many existing source
standards distinguished between the Latin, Cyrillic, and Greek "A" (ISO
8859-5, ISO 8859-7, and JIS X 0208, to name just a few). The round-trip
compatibility requirement therefore forbade unifying the Latin, Greek,
and Cyrillic alphabets. Where round-trip compatibility was not an
issue, small alphabets were unified vigorously in UCS. Examples: Coptic
and Greek, TeX and APL, etc. It is a highly consistent design once you
take all the design considerations into account.
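
The round-trip argument can be checked with Python's standard codecs. This
is only a sketch: the unify() helper is hypothetical and stands in for a
design that was never adopted, and the byte values are the "A" positions
in ISO 8859-5 and ISO 8859-7.

```python
import unicodedata

# Each legacy standard keeps ASCII "A" at 0x41 and adds its own "A" elsewhere.
LEGACY = {
    "iso8859_5": b"\x41\xb0",  # Latin A, Cyrillic A
    "iso8859_7": b"\x41\xc1",  # Latin A, Greek Alpha
}


def round_trip(codec: str, raw: bytes) -> bytes:
    """Decode legacy bytes to Unicode and encode them back."""
    return raw.decode(codec).encode(codec)


def unify(text: str) -> str:
    """Hypothetical unified design: fold the look-alike capitals onto Latin A."""
    return text.replace("\u0410", "A").replace("\u0391", "A")


for codec, raw in LEGACY.items():
    text = raw.decode(codec)
    print(codec, [unicodedata.name(ch) for ch in text])
    # Separate code points: the original bytes come back unchanged.
    print("  round trip preserved:", round_trip(codec, raw) == raw)
    # Unified code point: the second byte comes back as 0x41, so data is lost.
    print("  preserved if unified:", unify(text).encode(codec) == raw)
```

With separate code points both byte strings survive the trip; with the
hypothetical folding, the second byte of each string collapses to 0x41.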

By the way, there are coded character sets that did unify the Greek and
Latin alphabets, such as TeX and GSM 03.38:
<http://www.unicode.org/Public/MAPPINGS/ETSI/GSM0338.TXT>
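
A rough sketch of what such a unified set implies for conversion to and
from UCS. The table below is a hand-copied subset of the GSM 03.38
default alphabet and should be checked against the mapping file above;
the helper names and the choice to decode shared positions as Latin are
assumptions made for illustration.

```python
# Greek capitals that have their own position in the GSM 03.38 default alphabet
# (hand-copied subset; verify against the mapping file before relying on it).
GSM_GREEK_ONLY = {
    0x10: "\u0394", 0x12: "\u03a6", 0x13: "\u0393", 0x14: "\u039b",
    0x15: "\u03a9", 0x16: "\u03a0", 0x17: "\u03a8", 0x18: "\u03a3",
    0x19: "\u0398", 0x1A: "\u039e",
}
GREEK_TO_GSM = {ch: code for code, ch in GSM_GREEK_ONLY.items()}

# Greek capitals with no position of their own: they share the Latin letter
# of the same shape (Alpha/A, Beta/B, Epsilon/E, ...).
SHARED = dict(zip(
    "\u0391\u0392\u0395\u0396\u0397\u0399\u039a\u039c\u039d\u039f\u03a1\u03a4\u03a5\u03a7",
    "ABEZHIKMNOPTYX",
))


def encode_greek_caps(text: str) -> list[int]:
    """Map Greek capitals to GSM code values (illustrative helper)."""
    return [GREEK_TO_GSM[ch] if ch in GREEK_TO_GSM else ord(SHARED[ch])
            for ch in text]


def decode_caps(codes: list[int]) -> str:
    """Map GSM code values back to UCS, reading shared positions as Latin."""
    return "".join(GSM_GREEK_ONLY.get(c) or chr(c) for c in codes)


athens = "\u0391\u0398\u0397\u039d\u0391"  # Greek capitals: Alpha Theta Eta Nu Alpha
codes = encode_greek_caps(athens)
print([hex(c) for c in codes])
print(decode_caps(codes) == athens)  # the Greek/Latin distinction is gone
```

The decoded string looks identical on screen, but four of its five
characters are now Latin code points, which is exactly the loss that the
round-trip requirement rules out for sets that keep the scripts apart.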

Markus

-- 
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org,  WWW: <http://www.cl.cam.ac.uk/~mgk25/>

-
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/lists/