Tuesday, 30 September 2003, 17:58:33 CEST, Owen Taylor wrote:

As far as I know, you shouldn't lose that. That's done completely separate from input method handling.


Uhm, I must have been mistaken. I remember having problems with it a week or so ago (with Gtk+ 2.2.4), but I cannot replicate it now.


It must have been some other thing, sorry for pointing in the wrong direction :-)

The main thing you will lose is Control-shift-hex-digits Unicode input.

Wow, I didn't even know those existed -- thanks :-)


 A) Operating system independence - GTK+ needs to handle compose
    sequences when not running under X as well, making the operating
    system independent input method the default means it gets tested.

 B) Full set of Unicode compose sequences when not running under
    a Unicode locale

Thanks for explaining those bits.


 C) Get to add some extra neat features like the control-shift
    digits support.

Yes, I never knew about them -- I just love it ;-)


For many scripts, Pango does that.

Yes, I know that, and that's why I was surprised (if not disappointed) that it doesn't handle a really simple case like I'm about to describe.


If you have a font/script combination that you wish were handled, I can point out where in the code it would need to be handled. Making Unicode normalization forms NFC and NFD always render the same is surprisingly tricky, but specific instances are not hard to handle.

Okay, the problem is really simple. The Serbian language uses a Cyrillic alphabet and four diacritics/accents. In normal texts (meaning, not linguistic or grammar-related texts), accents are not normally used, except in a few cases where they distinguish words that are otherwise spelled the same (like "kod" meaning "at" and "kôd", which came from "code", with a "long o"; or "da" meaning "to" and "dâ" meaning "give", seen very frequently in constructs like "he wants to give"). What's worse is that many Serbians have so far been using Latin "a-circumflex" or "o-circumflex", because their glyphs look the same as those for Cyrillic a and o.


These four accents are used only on vowels (in Serbian, those are a, e, i, o and u, Cyrillic of course).

Unfortunately, Unicode is not planning to add these characters in precomposed form (even after several requests), probably on the reasoning that all Unicode-aware tools support decomposition and composition, or perhaps because of their supposed "goal" of encoding just characters, not their variations (which was unachievable from the start, because they've included "fi", "fl" and other ligatures for various compatibility reasons).
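
To illustrate the point about missing precomposed characters, here is a minimal sketch using Python's standard unicodedata module. It takes the combining circumflex (U+0302) from the "kôd"/"dâ" examples above and attaches it to the five vowels in both scripts; the helper name nfc_length is made up for this illustration. The exact result depends on the Unicode data shipped with the Python in use, but as far as I know the Latin pairs compose into one code point under NFC while the Cyrillic pairs stay as base letter plus combining mark, so it is left to the renderer to position the accent.

```python
import unicodedata

# Combining circumflex, as in the "kôd" and "dâ" examples.
COMBINING_CIRCUMFLEX = "\u0302"

# The five vowels in each script (Cyrillic: а е и о у).
LATIN_VOWELS = "aeiou"
CYRILLIC_VOWELS = "\u0430\u0435\u0438\u043e\u0443"


def nfc_length(base, mark=COMBINING_CIRCUMFLEX):
    """Return the number of code points left after NFC-normalizing base + mark.

    1 means a precomposed character exists; 2 means the mark stays separate.
    (Hypothetical helper, only for this illustration.)
    """
    return len(unicodedata.normalize("NFC", base + mark))


latin_lengths = [nfc_length(v) for v in LATIN_VOWELS]
cyrillic_lengths = [nfc_length(v) for v in CYRILLIC_VOWELS]

print("Latin:   ", latin_lengths)
print("Cyrillic:", cyrillic_lengths)
```

This is why substituting Latin "â" or "ô" appears to work today: those sequences have a precomposed form that any font with Latin-1 coverage can draw, whereas the Cyrillic ones need mark positioning at rendering time.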

So, I'd gladly appreciate a pointer to where I should look to implement this kind of mapping (four accents over any of five letters). OK, I've checked out the Pango sources and will delve into them later, but I'd still take any pointers you've got.

Thanks again for shedding some light on these issues.

Cheers,
Danilo
_______________________________________________
I18n mailing list
[EMAIL PROTECTED]
http://XFree86.Org/mailman/listinfo/i18n