At 23:34 03/12/07 +0900, Jungshik Shin wrote:

On Sun, 7 Dec 2003, Peter Jacobi wrote:

> There is some mixup of lang and encoding tagging, which I didn't fully
> understand.

   When lang is not explicitly specified, Mozilla resorts to 'inferring'
'langGroup' ('script (group)' would have been a better term) from
the page encoding. Because UTF-8 is script-neutral, it's important to
specify 'lang' explicitly. Your page is in ISO-8859-1, so without
lang specified, it's assumed to be in the 'x-western' langGroup (well, Latin
script). Anyway, this behavior changed slightly in the Windows
version recently (I forget whether I committed that patch before or
after 1.4), and each Unicode block is assigned a default 'script'. The way fonts
are picked up by the Xft version of Mozilla makes it harder to do the
equivalent on Linux.
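
The inference described above can be sketched roughly as follows. This is only an illustration of the idea, not Mozilla's code: the function name, both lookup tables, and the 'unknown-script' label are made up for the example. Only the ISO-8859-1 to 'x-western' mapping comes from the description above.

```python
from typing import Optional

# Hypothetical tables for illustration only; not Mozilla's actual data.
ENCODING_TO_LANGGROUP = {
    "iso-8859-1": "x-western",  # the one mapping mentioned in the text
    "euc-kr": "ko",             # assumed for the example
    "shift_jis": "ja",          # assumed for the example
}
LANG_TO_LANGGROUP = {
    "en": "x-western",
    "de": "x-western",
    "ko": "ko",
    "ja": "ja",
}
# Placeholder label for "nothing can be inferred" (an assumption).
UNKNOWN_LANGGROUP = "unknown-script"


def infer_lang_group(encoding: str, lang: Optional[str] = None) -> str:
    """Pick a langGroup: an explicit lang wins, else guess from the encoding."""
    if lang:
        primary = lang.lower().split("-")[0]  # "ko-KR" -> "ko"
        if primary in LANG_TO_LANGGROUP:
            return LANG_TO_LANGGROUP[primary]
    # A script-neutral encoding such as UTF-8 has no entry in the table,
    # so without an explicit lang nothing useful can be inferred.
    return ENCODING_TO_LANGGROUP.get(encoding.lower(), UNKNOWN_LANGGROUP)


print(infer_lang_group("ISO-8859-1"))      # guessed from the encoding
print(infer_lang_group("UTF-8"))           # nothing to go on
print(infer_lang_group("UTF-8", "ko-KR"))  # explicit lang decides
```

The point of the sketch is the middle case: with UTF-8 and no lang, the encoding gives the guess nothing to work with.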

I know that font selection/composition is a terribly difficult business, and hard work, so improving things takes time.

Starting out with certain assumptions about fonts for certain
encodings is clearly very helpful for speed. But I think that
not (correctly) rendering a character that is obviously in
one script and not in another is a bad idea.

Years ago, I developed a very flexible system that was able to
start out with the user-selected font but would use another
font if the first font wasn't able to do the job. The basic
architecture was in many ways very simple, but it took quite
some time to get it right. Once I had this basic architecture,
all kinds of neat things became very easy. For details, see
the paper from the 7th Unicode Conference at:
http://www.ifi.unizh.ch/groups/mml/people/mduerst/papers/PS/FontComposition.ps.gz
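
To make the general idea concrete, here is a minimal sketch of per-character font fallback. It is not the architecture from the paper: the function name, the font names, and the coverage table are all hypothetical, and a real system would query the fonts themselves for coverage.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical coverage table: font name -> set of code points it can render.
Coverage = Dict[str, Set[int]]

MISSING = "<missing>"  # placeholder for characters no font covers


def compose_runs(text: str, preferred: str, fallbacks: List[str],
                 coverage: Coverage) -> List[Tuple[str, str]]:
    """Split text into (font, substring) runs.

    Each character gets the user-selected font if that font covers it,
    otherwise the first fallback font that does.
    """
    chain = [preferred] + fallbacks
    runs: List[Tuple[str, str]] = []
    for ch in text:
        font = next((f for f in chain if ord(ch) in coverage.get(f, set())),
                    MISSING)
        if runs and runs[-1][0] == font:
            # Same font as the previous character: extend the current run.
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs


# Made-up fonts: one covering printable ASCII, one covering Hangul syllables.
coverage = {
    "LatinFont": set(range(0x20, 0x7F)),
    "HangulFont": set(range(0xAC00, 0xD7A4)),
}
print(compose_runs("ab\ud55c\uae00c", "LatinFont", ["HangulFont"], coverage))
```

Here the selected font is always tried first, so the fallback is used only for the characters it cannot handle.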



Regards, Martin.



