On Sun, 14 Dec 2003, Jan Willem Stumpel wrote:

> In the Mozilla font preferences you can set font preferences for
> Unicode, as well as for specific languages like Western, Japanese,
> etc. Am I then correct in assuming that the language-specific
> preferences always take priority over the Unicode preferences?
> Even when displaying a Web page which has "charset=utf-8"
> in the headers?

Yes, it's confusing. I think we should get rid of the font preference
entry for Unicode for exactly that reason (though there is some use
for it at the moment). The font selection in Mozilla is strongly
influenced by 'langGroup' (which would be better named 'script' or
'script group'). How is it determined? If there's an explicit
specification of the language in the document (with 'lang' in html
and 'xml:lang' in xml/xhtml) [1], it's honored. If not, it's inferred
from the document encoding. Obviously, this inference doesn't work at
all for utf-8. Currently, Mozilla uses the 'langGroup' corresponding
to the current locale for UTF-8 documents. That is, if you run
Mozilla under a zh_TW.(UTF-8|big5|EUC-TW) locale, the langGroup of a
utf-8 document is regarded as zh-TW. This doesn't work well, and it
totally breaks down when you have an iso-8859-1 (or any other
non-Unicode encoding) document with a lot of characters outside the
repertoire of ISO-8859-1 represented as NCRs (see
http://bugzilla.mozilla.org/show_bug.cgi?id=208479 and
http://bugzilla.mozilla.org/show_bug.cgi?id=91190). To work around
this problem, Mozilla on Windows maps Unicode code blocks to
Mozilla's 'langGroups', which achieves what you asked below.

> In other words is there a mechanism (inside
> Mozilla) that says
>
> - hmm... I have to display the character with number 49436 (hex
> C11C) here.
> - this character is in the range of Korean syllables.
> - now has a language-specific Korean font been specified? If so
> I'll use it.
> - If not, I use the Unicode font (Bitstream Cyberbit, or
> whatever).

As I wrote above, on Windows, Mozilla does more or less what you
describe.
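Very roughly, the Windows behaviour amounts to something like the toy
sketch below. This is NOT Mozilla's actual code: the real mapping
covers every Unicode code block, so treat the ranges and the table
here as made up for illustration (the group names follow the
langGroup identifiers you see in Mozilla's font prefs).

  /* Toy sketch only: pick a langGroup from the code point itself,
   * by Unicode block, independent of document encoding and locale. */
  #include <stdio.h>

  static const char *langgroup_of(unsigned int ch)
  {
      if (ch >= 0xAC00 && ch <= 0xD7A3)
          return "ko";            /* Hangul syllables */
      if (ch >= 0x3040 && ch <= 0x30FF)
          return "ja";            /* Hiragana and Katakana */
      if (ch >= 0x0400 && ch <= 0x04FF)
          return "x-cyrillic";    /* Cyrillic */
      return "x-western";         /* everything else in this toy,
                                     including Latin Extended-A */
  }

  int main(void)
  {
      /* Your example: U+C11C (decimal 49436) is a Hangul syllable,
       * so the font set for the Korean langGroup is tried first. */
      printf("U+C11C -> %s\n", langgroup_of(0xC11C));
      /* o with macron, U+014D, lands in the 'Western' group. */
      printf("U+014D -> %s\n", langgroup_of(0x014D));
      return 0;
  }

The point is only that the decision is keyed off the code point, not
off the document encoding or the locale.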
Mozilla-X11core and Mozilla-Xft have different font selection
mechanisms. Mozilla-Xft is strongly dependent on fontconfig, which
usually gives a much better result than the font selection mechanism
of Mozilla-X11core, but that also makes it hard to fix bug 208479
mentioned above.

> In other words, are huge "complete Unicode" fonts like Bitstream
> Cyberbit or Arialuni (which I promise not to try to use again..)
> only used for filling in the gaps where there are no
> language-specific fonts available? There does not seem to be much
> point in having them, then?

You can also configure Mozilla to use those pan-Unicode fonts (or
fonts whose coverage is broad enough) for all langGroups you're
interested in.

> Another question: does Mozilla consider 'Latin Extended A'
> characters like ō (o with macron) to be 'Western'? Many Western

As I explained above, Mozilla-Win does, but in Mozilla-X11core and
Mozilla-Xft, which character belongs to which langGroup is not a
function of the Unicode code point (as it should be) but a function
of the current document encoding and the value of 'lang/xml:lang'.

> fonts (like Times New Roman) have them and display them fine.
> But for instance Bitstream Vera Serif does not have them, and some
> other font (I don't know which) is substituted. Which rules are
> used for this substitution? Does mozilla look for them in
> *another* Western font, or does it look in the 'Unicode' font?

Mozilla's font selection mechanism is so complex that I can't explain
it in a few words (and it's also platform/toolkit dependent). In
Mozilla-Xft, fonts for the 'Unicode' langGroup are mostly immaterial,
IIRC (I have to look up the code). Mozilla-Xft searches for a font to
render a character in the prioritized list of fonts returned by
fontconfig. Therefore, what fontconfig returns in response to
Mozilla's query (which usually specifies 'lang' and a font family
name, but NOT the characters to render) determines which font is used
to render which character.
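If you're curious what such a query gives you on your own system,
here is a minimal sketch using the public fontconfig API (an
illustration of the kind of query, not Mozilla's actual code; build
it with the flags from 'pkg-config --cflags --libs fontconfig'):

  /* Illustration only: ask fontconfig for its prioritized list for
   * a family and a language, without naming any characters. */
  #include <stdio.h>
  #include <fontconfig/fontconfig.h>

  int main(void)
  {
      int i;
      FcPattern *pat;
      FcFontSet *set;
      FcResult result;

      if (!FcInit())
          return 1;

      pat = FcPatternCreate();
      FcPatternAddString(pat, FC_FAMILY, (const FcChar8 *) "serif");
      FcPatternAddString(pat, FC_LANG,   (const FcChar8 *) "ko");

      FcConfigSubstitute(NULL, pat, FcMatchPattern);
      FcDefaultSubstitute(pat);

      /* Ask for the whole sorted list, not just the best match. */
      set = FcFontSort(NULL, pat, FcTrue, NULL, &result);

      /* The first few entries are what an Xft client falls back
       * through when the preferred family lacks a glyph. */
      for (i = 0; set && i < set->nfont && i < 5; i++) {
          FcChar8 *family;
          if (FcPatternGetString(set->fonts[i], FC_FAMILY, 0, &family)
              == FcResultMatch)
              printf("%d: %s\n", i, (const char *) family);
      }

      if (set)
          FcFontSetDestroy(set);
      FcPatternDestroy(pat);
      FcFini();
      return 0;
  }

Nothing in the query names the characters that are about to be
rendered; the 'lang' and the family name are all fontconfig gets to
work with.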
Mozilla-X11core is a different story. Using the 20-year-old XLFD
makes it very hard to do things right (if you take a look at
nsFontMetricsGTK.cpp at http://lxr.mozilla.org, you'll see what I
mean), and I guess the fonts specified for the 'unicode' langGroup
are referred to at a certain stage.

> > Mozilla's international release notes is your friend although
> > we didn't give gory details in the document. In Mozilla, goto ...

> Thanks very much for pointing this out. I had found out about the

You're welcome :-)

> As regards to printing:
> I have (and have had for years) just 'lprng' and 'magicfilter' to
> print on my old Laserjet IIP. Also xprint works with that (as far
> as it works). Is there any point for me (or in general for users
> wanting a 100 % Unicode system) in switching to CUPS?

I guess magicfilter should be fine, especially considering that you
have a non-PS printer. CUPS is handy when you have a PS printer
that's not quite up-to-date. Mozilla's FT2 printing module produces
level 3 PS output that old PS printers (level 1 or level 2) can't
deal with. I asked the CUPS developers to add a filter (a PS 'level
downgrader') for them, so that it's easier to do the level
downgrading with CUPS than with magicfilter. It's certainly possible
with magicfilter, and I have done it before, because gs has supported
it for a long time; but since it doesn't come set up by default, you
have to tinker with your magicfilter configuration.

Jungshik

[1] It's always a good idea to specify the language of your document,
no matter what encoding you use, not just to help browsers pick a
more suitable font but also to enhance the accessibility of your web
pages for the disabled and for automated agents. See also
http://www.w3.org/International/questions/qa-css-lang.html

--
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/
