RE: Linux console internationalization

Kent Karlsson Thu, 14 Aug 2003 12:59:22 -0700

Beni Cherniavsky wrote:
> Kent Karlsson wrote on 2003-08-12:
> 
> > By and large, "decomposed" vs. "precomposed" only makes sense
> > for Latin, Greek, Cyrillic, Hangul, and Hiragana/Katakana.
> > There are precomposed letters also for other scripts, but mostly they
> > are not to be recommended.
> >
> Hebrew also has quite a lot of precomposed characters and they are not
> recommended either (in fact they are very annoying because xterm
> greedily composes them but most fonts lack them, so you get empty
> boxes for a lot of consonant + vowel combinations but not for others...).


I have no idea why xterm does such a thing. Note that those letters have
canonical decompositions, but they are "composition excluded" for the
Unicode normal forms NFC and NFKC (so composing them cannot be part of
Unicode normalisation).
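
To illustrate what "composition excluded" means in practice, here is a
minimal sketch using Python's standard unicodedata module. The choice of
U+FB31 (HEBREW LETTER BET WITH DAGESH) is only an example; it is not a
letter named in this thread.

```python
import unicodedata

# Example pair (an assumption chosen for illustration):
BET_WITH_DAGESH = "\uFB31"         # precomposed presentation form
BET_PLUS_DAGESH = "\u05D1\u05BC"   # BET followed by combining DAGESH


def nfc(text):
    """Return the NFC normal form of text."""
    return unicodedata.normalize("NFC", text)


# The decomposed sequence is NOT recomposed by NFC ...
decomposed_stays = nfc(BET_PLUS_DAGESH) == BET_PLUS_DAGESH
# ... and the precomposed letter is actually taken apart by NFC.
precomposed_splits = nfc(BET_WITH_DAGESH) == BET_PLUS_DAGESH

print(decomposed_stays, precomposed_splits)
```

So a program that composes these pairs is doing something other than
NFC normalisation.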

> > Why do something entirely different for the "console"?  Why not adapt
> > XKB so that it, and its data files, can work for the "console" too?
> > (Likewise for an input method mechanism (XIM??).)
> >
> XKB might or might not be a good choice.  What's sure is that the
> world needs fewer keymap formats.  Here are open-source systems and
> applications I can now remember, each having a keymap format:
> 
> - X: xmodmap and XKB
> - Linux console
> - Emacs (leim)
> - VIM
> - Lyx
> - TeXmacs
> - Yudit
> - mined
> - Geresh
> - Allegro (gaming library)

Eeeh, I was kind of hoping to get away with dealing with just one 
keyboard layout description format per system out of Windows,
MacOS X, and Unix/Linux (w. X)...

...
> So back to XKB.  It's powerful enough to handle almost any need.  Its
> formats are not ideal though.  It has some X cruft, like limitation to
> 4 groups, resolution to X keysyms instead of Unicode, and general

IIUC, UXXXX for any four-digit hexadecimal value XXXX is a predefined
keysym value mapping to character U+XXXX. (Not sure if it can handle
more than four digits, for the supplementary plane characters.)
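
For the numeric side of this, a hedged sketch: as far as I know, the
comments in X11's keysymdef.h reserve keysym values 0x01000100 to
0x0110ffff for U+0100 to U+10FFFF (code point plus 0x01000000), and the
Latin-1 keysyms equal their code points. The function name below is
made up for illustration, and the sketch says nothing about which
keysym *names* a given tool will parse.

```python
def unicode_to_keysym(code_point):
    """Return the Unicode-range X11 keysym value for a code point.

    Hypothetical helper; it ignores the older legacy keysyms that some
    characters (e.g. Greek, Cyrillic) also have.
    """
    # Printable Latin-1: the keysym value is the code point itself.
    if 0x20 <= code_point <= 0x7E or 0xA0 <= code_point <= 0xFF:
        return code_point
    # Everything else up to U+10FFFF: offset by 0x01000000.
    if 0x100 <= code_point <= 0x10FFFF:
        return 0x01000000 + code_point
    raise ValueError("no keysym for code point %#x" % code_point)


print(hex(unicode_to_keysym(0x20AC)))   # EURO SIGN
print(hex(unicode_to_keysym(0x1D11E)))  # a supplementary plane character
```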

> complexity, like custom names for all physical key names.  The Linux
> console and Allegro also suffer from this disease.  Tell me, why do I
> have to remember that the key is named `<TLDE>` in XKB and the
> resulting value is named `asciitilde` in the linux console, when I
> could have written ``~`` for both?

That's really key AE00 and often generates some other character than
TILDE, e.g.:

    key <TLDE>  { [   section,    onehalf,    paragraph, threequarters] };

> I think that most key mapping tasks can be done simply as a sequence
> of mappings on unicode strings, applied one after the other.  So the
> basic mapping from scancodes would use the well-known qwerty names,

No, thanks! (Why should one have to learn the US keyboard layout to deal
with these things??)

> like ``q``, and following layers would translate it to a non-qwerty
> layout (if needed).  This way, the amount of arbitrary names in the
> system is minimized, easing re-use in other environments.

No, not at all. While some of the key names used in XKB layout files
are ill-chosen, many of them "come from" ISO 9995; and that's good.
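
A small sketch of the point, assuming XKB-style positional names such
as AD01 (row D, first key) in the spirit of the ISO 9995 grid. The two
tables are heavily abbreviated and the helper name is invented for
illustration; they are not real XKB data files.

```python
# Abbreviated, illustrative layout tables keyed by positional key names.
LAYOUTS = {
    "us": {"AD01": "q", "AD02": "w", "AC01": "a"},
    "fr": {"AD01": "a", "AD02": "z", "AC01": "q"},
}


def symbol_at(layout, key_name):
    """Look up what the key at a given position produces in a layout."""
    return LAYOUTS[layout][key_name]


# The positional name is the same for every layout; only the symbol differs.
print(symbol_at("us", "AD01"), symbol_at("fr", "AD01"))
```

With positional names, the French table can be read and written without
any reference to what the US layout puts on the same key.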

                /kent k

--
Linux-UTF8:   i18n of Linux on all levels
Archive:      http://mail.nl.linux.org/linux-utf8/
