Markus Kuhn wrote on 2001-06-15 22:07 UTC:
> - You might have thought that supporting ISO 2022 locales is
> incompatible with __STDC_ISO_10646__. First of all, ISO 2022
> is *not* at all suitable for use in Unix locales anyway, so the
> issue should really be irrelevant. But just for the sake of argument,
> assuming you still really have to use it for whatever reasons,
> you will find that the UCS-4 private use areas are more than big
> enough to map all registered ISO 2022/ISO 2375 encodings into them.
>
> Essentially, you store character X from ISO-IR encoding number Y
> as
>
> (wchar_t) (0x60000000 + Y * 0x200000 + Conv_ISO_IR_Y_to_UCS(X))
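
For illustration only, the quoted formula can be sketched in C roughly as follows. This is a minimal sketch under stated assumptions, not the scheme described on the page linked below: the names iso2022_pack, iso2022_ir_number, iso2022_ucs and conv_iso_ir_to_ucs are hypothetical, the conversion helper only knows ISO-IR 6 and ISO-IR 100, and uint32_t stands in for a 32-bit wchar_t. The constants leave room for 256 registration numbers, since 0x20000000 / 0x200000 = 256.

```c
#include <assert.h>
#include <stdint.h>

/* Constants taken from the quoted formula. */
#define ISO2022_BASE   UINT32_C(0x60000000)  /* start of the range used   */
#define ISO2022_STRIDE UINT32_C(0x200000)    /* 2^21 code points per set  */
#define ISO2022_SLOTS  256u                  /* 0x20000000 / 0x200000     */

/* Hypothetical stand-in for Conv_ISO_IR_Y_to_UCS(X). Only two sets are
 * handled here; a real table would cover every registered set. */
static uint32_t conv_iso_ir_to_ucs(unsigned ir_number, unsigned char x)
{
    switch (ir_number) {
    case 6:   return x;          /* ISO-IR 6 (ASCII): identity mapping    */
    case 100: return x + 0x80u;  /* ISO-IR 100 (Latin-1 right half):
                                    0x20..0x7F -> U+00A0..U+00FF          */
    default:  return 0xFFFDu;    /* unknown set: REPLACEMENT CHARACTER    */
    }
}

/* Pack character x of ISO-IR set ir_number into one 32-bit value. */
static uint32_t iso2022_pack(unsigned ir_number, unsigned char x)
{
    assert(ir_number < ISO2022_SLOTS);
    return ISO2022_BASE + ir_number * ISO2022_STRIDE
         + conv_iso_ir_to_ucs(ir_number, x);
}

/* Recover the ISO-IR number from a packed value. */
static unsigned iso2022_ir_number(uint32_t wc)
{
    return (unsigned)((wc - ISO2022_BASE) / ISO2022_STRIDE);
}

/* Recover the UCS code point from a packed value. */
static uint32_t iso2022_ucs(uint32_t wc)
{
    return (wc - ISO2022_BASE) % ISO2022_STRIDE;
}
```

With these assumptions, iso2022_pack(6, 'A') gives 0x60C00041, and both the set number and the UCS value can be read back without any shift state.
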
I've just refined that proposal a bit and found that it is not even
necessary to use the ISO-IR number: you can practically squeeze the
entire designator sequence into the high bits of wchar_t. This then
covers even private-use designators. Things have also become clearer
after giving up the notion of handling these as UCS private-use characters.
The gory details are now at
http://www.cl.cam.ac.uk/~mgk25/ucs/iso2022-wc.html
so that there is a URL for the idea, for further reference. Please have a
look at it. Comments welcome!
Markus
--
Markus G. Kuhn, Computer Laboratory, University of Cambridge, UK
Email: mkuhn at acm.org, WWW: <http://www.cl.cam.ac.uk/~mgk25/>
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/