Re: [I18n] Unicode keysym questions

Markus Kuhn Fri, 05 Mar 2004 16:30:07 -0800

Alexander Krauss wrote on 2004-03-05 17:36 UTC:
> while trying to develop a keymap which includes mathematical symbols, I am
> wondering about the exact status of the "UCS keysyms" 0x01000000 and
> above... Are these already standardized? Do any X servers except XF86
> currently use them?


The X.Org Foundation gave me access to their CVS just last week to
amend the X11 protocol specification and to make this convention
official. I was on a phone conference with them last Monday, and they all
agreed that adding the 0x01000000 convention to the standard would be
the most sensible course.

> And... how exactly should they be interpreted by clients? Should there be
> any difference between for example "eacute" and "U00E9"?

You will have to continue to use the existing keysyms if a character has
one. The +0x01000000 Unicode mapping is exclusively meant for adding any
new keysyms for which there isn't already an existing code. This is to
preserve backwards compatibility.

Having said that, we may decide to retire a couple of the most obscure
of the old keysyms whose semantics have simply been lost in the mists
of time, and where we are confident that +/- 0 people are actually
using them.

A more tricky question is what to do with the unauthorized addition of
new Latin-8, Vietnamese, and Arabic keysyms a while ago by someone in
XFree86 in the code space that used to be restricted for X.Org.

I am mildly inclined to remove these and replace them with the
equivalent +0x01000000 Unicode mappings, in the interest of keeping
mapping tables small, but I don't know how widely they have become used
since XFree86 added them.

> Should a client
> interpret a U001B as an escape keystroke

None of the values in the range 0x01000000 to 0x01000100 will
technically be assigned keysyms, as all ISO 8859-1 codes already have
other code positions assigned. What your client decides to do if it
receives one of these nevertheless (or any other random unassigned keysym
value) will therefore be outside the X11 protocol specification.

> or are they all by definition
> "characters" and should be interpreted e.g. as "the user wants this thing
> in his UTF-8 document"...

If you want to add such a function to your client, then that is up to
you. However, a correctly configured X11 server should never send out a
0x0100001b keysym. Anything else would be a non-backwards-compatible
modification of the X11 protocol, which is likely to meet resistance
within X.Org.
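
To illustrate the client side, here is a minimal sketch in Python of how a
client might decode a keysym under this convention. The function name and
the upper bound 0x0110ffff (that is, U+10FFFF) are my assumptions for the
example, not text from the specification, and legacy keysyms outside
Latin-1 would still need a lookup table that is omitted here.

```python
UCS_OFFSET = 0x01000000  # the +0x01000000 convention discussed above


def keysym_to_codepoint(keysym):
    """Return the Unicode code point for a keysym, or None if unknown.

    Hypothetical helper: it covers only Latin-1 keysyms and direct UCS
    keysyms. Other legacy keysyms need a mapping table (not shown).
    """
    # Latin-1 keysyms are numerically equal to their ISO 8859-1 codes.
    if 0x20 <= keysym <= 0x7E or 0xA0 <= keysym <= 0xFF:
        return keysym
    # Direct UCS keysyms. The range deliberately starts above the
    # Latin-1 block, so 0x0100001b is treated as unassigned.
    # Upper bound is an assumption (U+10FFFF).
    if 0x01000100 <= keysym <= 0x0110FFFF:
        return keysym - UCS_OFFSET
    return None
```

With this sketch, 0x00e9 (eacute) decodes to U+00E9, while 0x0100001b is
rejected rather than silently treated as an escape keystroke.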

> Or is this simply not strictly defined?

We can define it now as strictly as we want and need, because the text
passage that defines that officially will be written over the next few
weeks.

> I also noticed that the Compose-Files of 4.3.0 in UTF-8 locales use the
> Uxxxx keysyms even for characters that have old keysyms (all the
> accented latin-{12...}  chars).

I would argue that any Uxxxx notation used in compose files will have to
go through a special unicode2keysym conversion function that uses a
mapping table. You cannot simply add 0x01000000 to *any* Unicode
character to get its keysym. If XFree86 doesn't do that conversion
correctly at the moment, please file this in the XFree86 Bugzilla so
that it does not get lost. Check what keysym values these compose files
produce on the wire, which is all that counts in the end.
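
A rough sketch of what such a conversion could look like, in Python for
brevity. The names compose_name_to_keysym and LEGACY_KEYSYMS are made up
for this example, the table holds only three sample entries (a real one
would cover every legacy keysym), and the 4-to-6 hex digit pattern for
Uxxxx is an assumption about the notation.

```python
import re

UCS_OFFSET = 0x01000000

# Tiny illustrative excerpt of a legacy mapping table:
# Unicode code point -> pre-existing keysym value.
LEGACY_KEYSYMS = {
    0x0100: 0x03C0,  # Amacron
    0x03B1: 0x07E1,  # Greek_alpha
    0x20AC: 0x20AC,  # EuroSign
}


def unicode_to_keysym(cp):
    """Map a code point to a keysym, preferring existing keysyms."""
    # Latin-1 characters keep their old keysyms (same numeric value).
    if 0x20 <= cp <= 0x7E or 0xA0 <= cp <= 0xFF:
        return cp
    # Control characters get no UCS keysym in this sketch.
    if cp < 0x100:
        return None
    # Use the legacy keysym if the character has one.
    if cp in LEGACY_KEYSYMS:
        return LEGACY_KEYSYMS[cp]
    # Only otherwise fall back to the +0x01000000 convention.
    return cp + UCS_OFFSET


def compose_name_to_keysym(name):
    """Convert a 'Uxxxx' name from a compose file to a keysym value."""
    m = re.fullmatch(r"U([0-9A-Fa-f]{4,6})", name)
    if m is None:
        raise ValueError("not a Uxxxx name: %r" % name)
    return unicode_to_keysym(int(m.group(1), 16))
```

Here "U00E9" yields 0x00e9, the same value as eacute, and only a
character without a table entry, such as "U2200", ends up as 0x01002200.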

Markus

-- 
Markus Kuhn, Computer Lab, Univ of Cambridge, GB
http://www.cl.cam.ac.uk/~mgk25/ | __oo_O..O_oo__

_______________________________________________
I18n mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/i18n
