> What if someday someone implemented locale as an attribute for text
> runs?  Wouldn't that be a straight-forward piece of work to draw mixed
> character sets, simply making the rendering code aware that it should
> switch to a different locale's set of fonts?

The TextRun is internally Unicode, so it is, and I think should
remain, ignorant of any font-design issues (which is what this is
really about) and of the associated encoding problems. We will,
though, need to implement a language attribute one day because of
the spellchecker.

Multilingual documents in languages that fit within a single 8-bit
encoding are no problem. Thus you can freely mix western
European languages, because they all use the iso-8859-1 encoding, or
you can mix English and Hebrew, because both can be
expressed in the iso-8859-8 encoding.
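
To make the "fits within a single 8-bit encoding" test concrete, here
is a rough Python sketch. It is not AbiWord code; the function name
and the default list of candidate encodings are my own assumptions,
chosen only for illustration.

```python
def common_encodings(text, candidates=("iso8859-1", "iso8859-2", "iso8859-8")):
    """Return the candidate 8-bit encodings that can represent all of text.

    The candidate list is an illustrative assumption, not something
    taken from any real font configuration.
    """
    usable = []
    for encoding in candidates:
        try:
            # encode() raises if any character is missing from the encoding
            text.encode(encoding)
        except UnicodeEncodeError:
            continue
        usable.append(encoding)
    return usable


if __name__ == "__main__":
    print(common_encodings("hello \u05e9\u05dc\u05d5\u05dd"))      # English + Hebrew
    print(common_encodings("caf\u00e9 \u05e9\u05dc\u05d5\u05dd"))  # French + Hebrew
```

An empty result means the text needs either several fonts or a single
font with wider coverage.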

Multilingual documents in languages that do not fit within a
single 8-bit encoding can be handled in two ways. We could use
gtk fontsets instead of fonts. A fontset is a collection of fonts
that are identical except for their encoding. The gtk code, I believe,
decides which of the fonts in the set, if any, contains the
character in question. I have met gtk fontsets only briefly while
working on the utf-8 stuff, but I came away with the impression that
moving from fonts to fontsets would require quite a bit of work.
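
As I understand it, the per-character decision amounts to something
like the Python sketch below. This only models the idea and is not
the gtk API: a "fontset" here is just an ordered list of encoding
names, and pick_font is a hypothetical helper.

```python
def pick_font(ch, fontset):
    """Return the first encoding in fontset that contains ch, else None.

    fontset is modelled as an ordered list of encoding names; a real
    fontset holds actual fonts, one per encoding.
    """
    for encoding in fontset:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            continue  # this font lacks the character, try the next one
        return encoding
    return None


if __name__ == "__main__":
    fontset = ["iso8859-1", "iso8859-2", "iso8859-8"]
    for ch in "a\u0159\u05e9\u6f22":
        print(repr(ch), pick_font(ch, fontset))
```

The order of the list matters: a character present in several
encodings is always drawn from the first font that has it.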

The other way is to use Unicode (16-bit) fonts, which is already in
place for locales that use the utf-8 encoding. This is a much cleaner
solution than fontsets and required a minimal amount of work to put
in place. Its main disadvantage is inflexibility of coverage: a
Unicode font may contain a lot of characters a particular user will
never use, or it may not be possible to find a font that covers the
ranges a particular user requires. (On the other hand, fontsets
are made of fonts with existing encodings, which often overlap
significantly, so that if, for instance, you load a fontset consisting of
iso8859-1, 8859-2 and 8859-8, you will have about half of the
characters in triplicate.)
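
The overlap can be checked with a few lines of Python. This is only a
sketch using Python's own codec tables, which may differ in detail
from what a given X font actually contains; the helper names are
made up for the example.

```python
def repertoire(encoding):
    """Return the set of characters an 8-bit encoding can represent."""
    chars = set()
    for byte in range(256):
        try:
            chars.add(bytes([byte]).decode(encoding))
        except UnicodeDecodeError:
            pass  # byte value is unassigned in this encoding
    return chars


def shared_characters(encodings):
    """Return the characters present in every one of the encodings."""
    return set.intersection(*(repertoire(e) for e in encodings))


if __name__ == "__main__":
    names = ["iso8859-1", "iso8859-2", "iso8859-8"]
    shared = shared_characters(names)
    for name in names:
        print(name, len(repertoire(name)))
    print("in all three:", len(shared))
```

The whole ASCII range lands in the shared set, which alone accounts
for 128 of the at most 256 positions in each font.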

There are Unicode fonts around that cover significant chunks of
the ucs-2 space, such as code2000, but they are unsurprisingly
huge. But then memory is considered cheap these days, so I
would not be surprised if fontsets disappear in the
foreseeable future, once Unicode fonts and the support for them
become more common.

Tomas

*********************************************
[EMAIL PROTECTED] / www.frydrych.net
PGP keys:  http://www.frydrych.net/contact.html
